Abstract

Since  its  introduction,  the  Ant  Colony  Optimization  (ACO)  meta-heuristic  has  been 
successfully  applied  to  a  wide  range  of  combinatorial  problems.  This  thesis  presents  the 
adaptation  of  ACO  to  a  new  NP-hard  problem  involving  the  replication  of  multi-quality 
database-driven  web  applications  (DAs)  by  a  large  application  service  provider  (ASP).  This 
problem  is  a  special  case  of  the  generalized  assignment  problem  (GAP)  which  occurs  in 
many  military  contexts  such  as  logistics  planning,  air  crew  scheduling,  and  communications 
network  management. 

The  ASP  must  assign  DA  replicas  to  its  network  of  heterogeneous  servers  so  that  user 
demand  is  satisfied  at  the  desired  quality  level  and  replica  update  loads  are  minimized. 

The  ACO  algorithm  proposed,  AntDA,  for  solving  the  ASP’s  replication  problem  is 
novel  in  several  respects:  ants  traverse  a  bipartite  graph  in  both  directions  as  they  construct 
solutions,  pheromone  is  used  for  traversing  from  one  side  of  the  bipartite  graph  to  the  other 
and  back  again,  heuristic  edge  values  change  as  ants  construct  solutions,  and  ants  may 
sometimes  produce  infeasible  solutions. 

Although experiments show that AntDA outperforms several other solution methods,
there was room for improvement in how quickly the ants converged on better
solutions. Therefore, to achieve faster convergence and better solution values for
larger problems, AntDA was combined with the variable-step policy hill-climbing
algorithm called Win or Learn Fast (WoLF). In experiments, adding this learning
algorithm to AntDA produced faster convergence while still outperforming the other
solution methods. However, as problem complexity rose, AntDA with the WoLF algorithm
converged to solutions that were worse than those found by AntDA by a statistically
significant margin, but it did so at a much faster rate.
Optimizing the Replication of Multi-Quality Web Applications Using ACO and WoLF

I.  Introduction 

1.1  Motivation 

With  the  rise  of  the  Internet  and  advances  in  services  desired  by  users,  companies 
with  websites  are  finding  it  more  economical  to  rent  a  network  from  a  provider  rather  than 
purchasing  and  maintaining  their  own.  In  order  to  do  this,  they  turn  to  Application  Service 
Providers  (ASPs).  This  relatively  new  business  partnership  introduces  interesting  technical 
problems  for  both  the  company  and  the  ASP. 

The ASP problem studied in this thesis is a special case of the generalized
assignment problem (GAP), which occurs in many military contexts such as logistics
planning, air crew scheduling, and communications network management.

With growth in size and number of users, making content widely available while
reducing the load on the web servers becomes a major challenge. Users want the
applications and servers they use to be available at all times and to respond
quickly to their requests. A single web server cannot handle all of this traffic.
Therefore, site owners create copies of their content, called replicas, and spread
them across the Internet using an ASP's expansive server network. In most cases,
user requests for an application are received by the nearest ASP server, where the
replicated logic interprets and processes them. Those requests needing information
from the back-end databases are passed to the owner's data center for further
processing. This combination of an application and its database is referred to as a
database application, or DA for short (see Figure 1).

There would be no problem if the data in the back-end databases never changed.
However, users of most applications require a certain quality of data freshness. The
quality of data freshness depends on how fresh the data in the database is: the
higher the quality of freshness requested, the more frequently the database needs
updating.

Figure 1: An example e-commerce site (showing the back-end database servers).

Although the quality aspect allows for some flexibility in replicating DAs, it
presents some significant issues. First, user demand for a DA's content is
unpredictable and sudden surges in DA demand occur. Second, updating a DA replica
with fresh data diminishes the replica's capacity for handling end-user requests. If
not managed effectively, this parasitic update load could cause more replicas to be
created than needed. Finally, the back-end databases of DA replicas can be huge, so
moving them and reestablishing replicas takes a non-trivial amount of time.

The core problem dealt with in this thesis investigation is that of an ASP's
assignment of replicas of its customers' DAs on its network of servers so as to
satisfy user demand (including the appropriate quality) for the DAs while minimizing
the parasitic database update load of the DA replicas. This problem is known as the
Quality-Sensitive DA Replication Problem, or DArep for short.




1.2  Solution  Approach 

To solve the DArep problem, an ant colony optimization algorithm, AntDA (originally
proposed in [53]), is investigated. First, AntDA's parameters are tuned and the
algorithm is tested against many different test cases. Results show that AntDA
performs better than the other search algorithms tested but has higher execution
times.

To reduce these higher solution times, AntDA is combined with a variable-step policy
hill-climbing algorithm called Win or Learn Fast (WoLF). Two different definitions of
the WoLF algorithm are used to produce two variations of AntDA: WoLFAntDA and
PD-WoLFAntDA. Adding the WoLF learning algorithm to AntDA allows the ACO heuristic to
be applied to more complex problems while still solving them in a reasonable amount
of time.

1.3  Thesis  Organization 

The Quality-Sensitive DA Replication Problem (the DArep problem) is the problem of
replicating Web-based applications together with copies of their associated databases
(DA replicas), which must be kept updated in order to meet the quality demands of
users' requests. This thesis examines this one aspect of replicating and delivering
Internet content of differing quality and presents different approaches for solving
the problem and implementing the solutions effectively.

The remainder of this document is organized in the following manner. Chapter 2
delves deeper into the background of the Quality-Sensitive DA Replication Problem.
The first part of the chapter describes the challenges of the DArep problem in detail
and gives a non-mathematical definition. It also presents a formalized variation of
DArep in which servers process updates and requests with varying efficiencies. The
following sections define other assignment problems and give in-depth descriptions of
the Ant Colony Optimization algorithm and the Win or Learn Fast algorithm, both of
which were adapted to solve the DArep problem and evaluated experimentally. Chapter 3
explains how the experiments are conducted and describes how the Ant Colony
Optimization and Win or Learn Fast algorithms were adapted to fit the DArep problem.
It also briefly describes three other search algorithms for solving assignment
problems that were used for comparison. Chapter 4 contains the initial performance
results for AntDA and explains the effects of changes and alternatives to the AntDA
algorithm that were made in order to improve performance. It concludes with the
performance results of WoLFAntDA and a comparison of its effect on the AntDA
algorithm. Chapter 5 concludes the main body of this thesis. It contains a summary of
the main contributions of this thesis and avenues for further research.
II.  Background  and  Related  Work

2.1  Introduction 

This chapter provides an overview of the background and related work on the
Quality-Sensitive DA Replication Problem and Ant Colony Optimization. It starts with
a description of the Quality-Sensitive DA Replication Problem and its environment and
discusses other assignment problems related to it. The Ant Colony Optimization
technique is then explained, followed by an in-depth review of the Win or Learn Fast
algorithm.

2.2  Problem  Background 

This section presents a deeper introduction to the environment of the
Quality-Sensitive DA Replication Problem and a non-mathematical problem statement.
The problem is then formulated mathematically as a 0-1 assignment problem [53] and is
proven to be NP-hard even when each DA in the system has only one quality level.

2.2.1  The DArep Environment.  With the rise of the internet, web-based applications
have become very complicated. Applications and services are now implemented as a
combination of application logic that takes user requests and talks to the back-end
databases to acquire the data content necessary to generate the appropriate response.
These systems are referred to as Database Applications, or DAs for short, and are
commonly seen in e-commerce and e-business (on-line stores, auctions, news services,
banking, etc.) (see Fig. 1).¹ However, as Table 1 shows, this web-centric model of
DAs is not fundamentally different from that found in other contexts such as
scientific grid computing [4, 5, 39, 44] and data warehousing [67, 68].

¹ Notionally, a server is a single computer or a group of computers working in
concert to implement a DA or a DA replica. To simplify matters, this document hides
this multiple-computer notion of server by always assuming a server is a single
computer.

With the explosion of internet users in the late 1990s and since, some database
applications have become too popular for their owners to maintain the infrastructure
necessary to handle the traffic. Consequently, these owners have turned to
application service providers (ASPs), such as Akamai [2] and ASP-One [6], to relieve
some of their application's workload by distributing the application logic onto the
ASP's extensive network of servers. This also relieves the owner of handling
infrastructure needs such as hardware, network, backup, security, and operating
systems, making the owner's process much more manageable. In most cases, requests for
the application are received by the ASP, where the replicated logic interprets and
processes them. Those requests needing information from the back-end databases are
passed to the owner's data center for further processing. With this arrangement,
access to the database often becomes the major performance bottleneck of the process
[65]. Therefore, replicating just the application logic may be insufficient. For an
owner to receive maximum benefit, the database must be replicated too [48, 58].

Figure 2: A Typical Application and Database (DA) Replica Setup. DA replicas consist
of application servers and local read-only database replicas. When reading data to
fulfill a user request, an application server accesses its local database. When
generating or changing data, the master database is contacted. The frequency at which
the master database synchronizes its database replicas determines the service quality
provided by the replica. [53]
Similarity | Web-based DA | Data Warehousing | Scientific Grid Computing
--- | --- | --- | ---
Goal | Assign DA replicas to servers, direct requests to appropriate replica, avoid DBMS overload | Place DW replicas on servers, data mine on appropriate replicas, avoid DBMS overload | Assign compute tasks to servers, avoid DBMS overload at servers
Servers | Hundreds of servers, each with its own capacity limit | Hundreds of servers, each with its own capacity limit | Hundreds of servers, each with its own capacity limit
Bottleneck | DBMS | DBMS | Computing tasks and DBMS
Data | Large databases generate HTML | Large databases store warehouse data | Experiment data in massive databases
Load Sources | User requests and replica database synchronization | Data mining operations and new data added to DW | Tasks query database and generate new data
Service Qualities | Data freshness requirements of users create quality levels | Levels: managers (low) and data analysts (high) | Levels: making hypothesis (low) and proving hypothesis (high)

Table 1: DAs, Data Warehousing, and Grid Computing Compared [53].


Recent advances in database caching and update propagation have made replicating the
database portion of multi-tiered web applications (Fig. 2) more feasible
[17, 18, 21, 28, 47, 49, 51, 55]. Nevertheless, database replication still has many
issues. Most importantly, the database replicas must be periodically updated so that
their content is timely, or fresh. Updating a database replica with fresh data strips
the replica of capacity for handling end-user requests and may create the need for
many more replicas. Also, databases are normally many gigabytes in size, so moving
them and establishing replicas can take a great deal of time.

One of the main difficulties faced by an application service provider (ASP) is the
decision of where to assign replicas of its customers' DAs on its network of servers.
In doing this, it must consider how best to meet user demand for the DAs while
keeping the cost of database updates for the DA replicas to a minimum. Users'
differing expectations about the timeliness or freshness of the content they receive
severely complicate this assignment problem [13, 22]. In other words, users may have
service quality requirements that have to be met. The implication of this is twofold.
First, not every DA replica must operate at the highest quality level, as there may
be users who are happy with a lower quality. Second, a request must be served from a
replica (a copy of the application logic and relevant portions of the database) that
meets or exceeds the user's quality requirement. The ASP's problem of deciding on
replica-to-server assignments of such quality-differentiated DAs is the
Quality-Sensitive DA Replication Problem, or DArep for short.

Figure 3: Distributed Environments Requiring Effective Replication Solutions. Shown
are examples of (a) Grid computing domains, (b) data warehousing operations, and
(c) application service providers replicating dynamic Web sites [53].


2.2.2  Quality-Sensitive DA Replication Problem Assignment Issues.  For the
Quality-Sensitive DA Replication Problem (DArep), several issues must be taken into
account when assigning replicas to servers: application masters and replica slaves,
dynamic content, keeping replicas fresh, large databases, and DBMS load and replica
server response times.

Application Masters and Replica Slaves: A trend in DA architecture and replication is
the use of the master/slave relationship for updating the database application. In
operation, there is a single master and zero or more slaves (or replicas). Both
masters and slaves contain a database component and are capable of receiving and
handling user requests. When a slave needs to update its database, it contacts the
master, which processes the update and propagates the appropriate changes to the
slaves, thereby refreshing their databases. Oracle's Database Cache and IBM/DB2
support master/slave replication [19].

Dynamic Content: While improvements in dynamic content caching have been made and
techniques exist to synchronize masters and slaves [17, 18, 21, 28, 47, 49, 51, 55],
caching's benefits are limited, meaning that ample access to source data is always
required. Replication can provide this ample access. An application and a relevant
portion of its database can be replicated by a service provider, as is done by Akamai
using IBM's WebSphere product [42]. Figure 2 shows the replication of a single DA. An
important concept to remember is that updates occur at a master database, which
propagates changes to database replicas based on the replicas' freshness
requirements.

Keeping Replicas Fresh: Database replicas have to be regularly updated so that their
content is timely, or fresh. Assigning a DA replica to a server induces a continuous
update load on the server's database component due to the frequent updates required
to maintain the replica's service quality. This update load is parasitic in the sense
that it reduces the replica's capacity for handling end-user requests. This resource
drain also rules out creating more replicas than demand warrants and limits how many
replicas a server can host. A higher quality of service requires fresher data and
therefore more frequent database synchronization, so the update load on a replica
increases with freshness or service quality. Understandably, update load mitigation
has been the subject of much research [17, 18, 20, 46, 49, 52, 57].
Large Databases: The database of a DA is normally many gigabytes in size, so moving
it and establishing replicas in response to changing demand can take a great deal of
time.

DBMS Load and Replica Server Response Times: Response times are the crucial
measurement of speed in today's internet, especially for e-commerce applications,
since slow response times translate into unhappy customers and lost revenue [47, 65].
Since request and response sizes are small and propagate quickly over today's
internet, the performance bottleneck for database-driven applications has been shown
to be the Database Management System (DBMS). DA response times depend greatly on
database load, and not necessarily on network delays or on placing replicas
geographically close to users [18, 28, 47, 65]. The database load of a DA replica has
two components: request load and update load. Request load results from queries
stemming from user requests. Update load is the resources consumed in synchronizing a
replica's slave database with its master database. Therefore, if the request and
update loads are kept from overloading the DBMS, response times will be manageable.

2.2.3  A Non-mathematical Definition of the DArep Problem.  The concepts presented up
to this point are used to define the Quality-Sensitive DA Replication Problem, DArep,
as shown in Fig. 4. All terms used in the definition were defined previously in this
chapter.

Figure 5 contains an ASP replication scenario that fits the DArep definition.
Customers maintain master databases from which updates are disseminated. Once
replicas are assigned to the ASP's servers, the database replicas are synchronized
with customers' master databases so as to maintain each replica's designated service
quality (Figs. 2 and 5). Since users access replicas, which are read-only, any
request that changes a DA's database is routed through the master database and then
disseminated to the replicas. Replica update/synchronization is costly in terms of
resources and hence has to be minimized while maintaining the appropriate freshness
levels.

This thesis presents and evaluates several algorithms for solving various forms of
the Quality-Sensitive DA Replication Problem.
The  Quality-Sensitive  DA  Replication  Problem

Given 

•  A  set  of  DAs  to  be  replicated  where  each  DA  has  one  or  more  freshness  quality  levels  at 
which  it  operates. 

•  A set of servers on which DAs can be placed. Each server has a (possibly
different) load limit. Zero or more DAs can be hosted on a server.

•  Request  rates  for  DA  content  and  freshness  levels  for  each  web  site  to  be  replicated. 

•  Operating  loads: 

-  Update  load:  The  load  of  maintaining  a  DA  replica  at  a  certain  freshness  level  is  the 
product  of  an  update  rate  and  DA  update  complexity. 

-  Request  load:  The  load  for  handling  requests  for  a  DA  replica  is  the  product  of  the  total 
request  rate  at  the  replica  and  the  expected  request  complexity  for  the  DA. 

•  Requests are considered satisfied only if the returned content meets or exceeds a
minimal freshness requirement stated in the request.

Find  an  assignment  of 

1.  DA  replicas  to  servers, 

2.  freshness  quality  levels  to  DA  replicas,  and 

3.  a  distribution  of  requests  to  replicas  that  fulfills  the  application  and  quality  demands  of  the 
requests 

such  that,  for  each  server,  the  sum  of  the  update  and  request  loads  for  DA  replicas  hosted  by  a  server 
does  not  exceed  the  server’s  load  limit. 

Figure  4:  The  Quality-Sensitive  DA  Replication  Problem  Informally  Defined 
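
The two operating loads in Fig. 4 are simple products, and a server is acceptable
when their sum over all hosted replicas stays within its load limit. The following
Python fragment is a minimal sketch of that arithmetic only. The class name, field
names, units, and numbers are illustrative assumptions and do not come from the
thesis or from [53].

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Replica:
    """One DA replica on a server (hypothetical field names and units)."""
    update_rate: float         # updates per second needed for its freshness level
    update_complexity: float   # load units consumed per update
    request_rate: float        # total requests per second routed to the replica
    request_complexity: float  # expected load units consumed per request


def update_load(r: Replica) -> float:
    # Update load = update rate x DA update complexity.
    return r.update_rate * r.update_complexity


def request_load(r: Replica) -> float:
    # Request load = total request rate x expected request complexity.
    return r.request_rate * r.request_complexity


def total_load(replicas: List[Replica]) -> float:
    # Sum of update and request loads over every replica hosted by one server.
    return sum(update_load(r) + request_load(r) for r in replicas)


def within_limit(replicas: List[Replica], load_limit: float) -> bool:
    # The server is acceptable only if the total does not exceed its limit.
    return total_load(replicas) <= load_limit


hosted = [Replica(2.0, 1.5, 10.0, 0.5), Replica(1.0, 2.0, 4.0, 1.0)]
print(total_load(hosted))          # 3.0 + 5.0 + 2.0 + 4.0 = 14.0
print(within_limit(hosted, 15.0))  # True
print(within_limit(hosted, 13.0))  # False
```

Raising a replica's freshness level raises its update rate, which increases the
update load and leaves less of the server's limit for request load.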


Figure  5:  An  Application  Service  Provider  Network  Hosting  DAs  [53]. 

2.2.4  The Quality-Sensitive DA Replication Problem Formalized.  In this subsection,
the DArep problem is formulated as a linear, mixed-integer minimization problem,
which has been proven to be NP-hard [53].
Definitions:

•  The ASP has $m$ servers, $S = \{1, \ldots, m\}$.

•  Each server $s \in S$ has a processing capacity denoted by $C_s$.

•  The ASP has $n$ customer-provided DAs to be hosted, $D = \{1, \ldots, n\}$, on its
$m$ servers.

•  Each $d \in D$ operates at one or more service quality levels,
$Q_d = \{1, \ldots, q, \ldots, q_{max}(d)\}$, where $q_{max}(d)$ is the highest level
offered by DA $d$.

•  The request load for DA $d$, $RL_d$, is the sum of the request loads, denoted by
$rl_{d,q}$, for each of its quality levels: $RL_d = \sum_{q \in Q_d} rl_{d,q}$.

•  For each service quality of a DA $d$ there is a certain update load required to
maintain that service quality:
$UL_d = \{ul_{d,1}, \ldots, ul_{d,q}, \ldots, ul_{d,q_{max}(d)}\}$.

•  Let $x_{s,d,q} \in \{0, 1\}$ be a binary variable that indicates that server $s$ is
hosting a replica of a certain $(d, q)$ pair, where $(d, q)$ pair is shorthand for
quality $q$ of DA $d$. Let $\lambda_{s,d,q} \in [0, 1]$ denote the fraction of the
request load of a $(d, q)$ pair, $rl_{d,q}$, assigned to server $s$. The update load
experienced by server $s$ depends on the quality level of the DA replicas it hosts:

$$ul_s = \sum_{d \in D} \sum_{q \in Q_d} x_{s,d,q} \cdot ul_{d,q} \qquad (1)$$

Server $s$'s request load is the fraction of each $(d, q)$ pair's request load sent
to it:

$$rl_s = \sum_{d \in D} \sum_{q \in Q_d} \lambda_{s,d,q} \cdot rl_{d,q} \qquad (2)$$

Objective and Constraints: The ASP seeks an assignment of DA replicas to servers that
minimizes the system-wide update burden, $UB$, and is subject to four constraints.

$$\min UB = \min \sum_{s \in S} \sum_{d \in D} \sum_{q \in Q_d} ul_{d,q} \cdot x_{s,d,q} \qquad (3)$$

1.  Request load for each quality of each DA is satisfied:
$\sum_{s \in S} \lambda_{s,d,q} = 1$ for every $d \in D$ and $q \in Q_d$.

2.  Only one quality level of a DA is hosted by a server:
$\sum_{q \in Q_d} x_{s,d,q} \le 1$.

3.  A server's processing capacity cannot be exceeded: $ul_s + rl_s \le C_s$.

4.  Requests processed by a replica must meet or exceed the request's quality
expectation, $q_r$:
$\sum_{q_s \mid q_s, q_r \in Q_d \wedge q_s \ge q_r} x_{s,d,q_s} \ge \lambda_{s,d,q_r}$.
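
To make the formulation concrete, the sketch below evaluates Eqs. (1) to (3) and
checks the four constraints for a candidate solution. It is an illustrative reading
of the model rather than code from the thesis or from [53]: the dictionary layout,
the function names, the name `lam` for the request-fraction variable, the tolerance,
and the toy instance are all assumptions, and constraint 1 is read as applying to
each (d, q) pair separately.

```python
def server_loads(x, lam, ul, rl, servers):
    """Per-server (update load, request load), following Eqs. (1) and (2)."""
    loads = {}
    for s in servers:
        ul_s = sum(ul[d, q] * v for (s2, d, q), v in x.items() if s2 == s)
        rl_s = sum(rl[d, q] * frac for (s2, d, q), frac in lam.items() if s2 == s)
        loads[s] = (ul_s, rl_s)
    return loads


def update_burden(x, ul):
    """System-wide update burden UB, the objective in Eq. (3)."""
    return sum(ul[d, q] * v for (_, d, q), v in x.items())


def is_feasible(x, lam, ul, rl, capacity, tol=1e-9):
    """Check constraints 1-4 for a candidate solution (x, lam)."""
    servers = list(capacity)
    # 1. The request load of every (d, q) pair is fully distributed.
    for (d, q) in rl:
        if abs(sum(lam.get((s, d, q), 0.0) for s in servers) - 1.0) > tol:
            return False
    # 2. A server hosts at most one quality level of any DA.
    hosted = {}
    for (s, d, q), v in x.items():
        hosted[s, d] = hosted.get((s, d), 0) + v
    if any(count > 1 for count in hosted.values()):
        return False
    # 3. Update load plus request load stays within each server's capacity.
    for s, (ul_s, rl_s) in server_loads(x, lam, ul, rl, servers).items():
        if ul_s + rl_s > capacity[s] + tol:
            return False
    # 4. Requests of quality q_r go only to replicas of quality >= q_r.
    for (s, d, q_r), frac in lam.items():
        good_enough = sum(v for (s2, d2, q_s), v in x.items()
                          if s2 == s and d2 == d and q_s >= q_r)
        if good_enough < frac - tol:
            return False
    return True


# Toy instance (made up): two servers, one DA (d = 1) with quality levels 1 and 2.
capacity = {1: 12.0, 2: 10.0}
ul = {(1, 1): 1.0, (1, 2): 3.0}           # update load per (d, q)
rl = {(1, 1): 4.0, (1, 2): 6.0}           # request load per (d, q)
x = {(1, 1, 2): 1, (2, 1, 1): 1}          # keys are (s, d, q)
lam = {(1, 1, 2): 1.0, (1, 1, 1): 0.5, (2, 1, 1): 0.5}

print(update_burden(x, ul))                   # 4.0
print(is_feasible(x, lam, ul, rl, capacity))  # True
```

In the toy instance, server 1 hosts the quality-2 replica and can therefore also
absorb half of the quality-1 requests, whereas routing quality-2 requests to server 2
would violate constraint 4.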

Although, in theory, the above formulation could enable the ASP to find optimal DA
assignments, in practice optimal solutions are elusive since DArep is in the class of
NP-hard problems, even if all the DAs in an ASP have only one freshness quality
level, as shown in [53].

2.3  Other  Assignment  Problems 

Essentially, the Quality-Sensitive DA Replication Problem is an optimization
assignment problem with many constraints. This section covers two categories of
assignment problems and how they have been solved.

Pairing problems constitute a vast family of problems which deal with practical
design and resource-allocation problems. Different versions of these problems have
been studied since the mid 1950s due both to their many applications and to the
challenge of understanding their combinatorial nature. Some can be solved in
polynomial time, whereas others are extremely difficult. The simplest one is the
Assignment Problem, which can be solved efficiently by the Hungarian Algorithm [61].
Others, such as the Generalized Assignment Problem and the Quadratic Assignment
Problem, are much harder and are NP-hard [61].

2.3.1  Generalized Assignment Problem.  Assignment problems consist of finding the
best assignment of some set of items to the items (or agents) of another disjoint set
according to some predefined function. Their many applications include the assignment
of tasks to workers, of jobs to machines, of fleets of aircraft to tasking orders, or
of school buses to routes [1, 60]. However, in most practical applications, each
agent requires a quantity of some limited resource to process a given job or has a
limited capacity for a given resource. Therefore, the assignments have to be made
taking into account the resource necessity or capacity of each agent. The problem
derived from the classical Assignment Problem by taking into account these capacity
constraints is known as the Generalized Assignment Problem (GAP). Among its many
applications are assigning variable-length commercials to time slots [7], assigning
jobs to computers in a computer network [7], and distributing activities to the
different subsections of a company when making a project plan [69]. Besides these
applications, it also appears as a subproblem in a variety of combinatorial problems
such as Vehicle Routing [36] or Resource Location [31, 61]. A classic NP-hard problem
that is similar to the Generalized Assignment Problem is graph coloring [24, 25].
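
As a small illustration of how capacity constraints change an assignment, the
following sketch solves a tiny GAP instance by exhaustive enumeration. It assumes the
common textbook form of the problem (each job goes to exactly one agent, total cost
is minimized, and each agent's resource use may not exceed its capacity). The
function name and the data are made up for illustration, and brute force is only
practical for very small instances.

```python
from itertools import product


def solve_gap_brute_force(cost, usage, capacity):
    """Return (best_cost, assignment) where assignment[j] is the agent for job j.

    cost[i][j]  : cost of giving job j to agent i
    usage[i][j] : resource consumed when agent i processes job j
    capacity[i] : resource available to agent i
    Returns (None, None) if no assignment respects every capacity.
    """
    n_agents = len(capacity)
    n_jobs = len(cost[0])
    best_cost, best_assignment = None, None
    # Enumerate every way of giving each job to exactly one agent.
    for assignment in product(range(n_agents), repeat=n_jobs):
        used = [0] * n_agents
        for job, agent in enumerate(assignment):
            used[agent] += usage[agent][job]
        # Discard assignments that overload any agent.
        if any(u > cap for u, cap in zip(used, capacity)):
            continue
        total = sum(cost[agent][job] for job, agent in enumerate(assignment))
        if best_cost is None or total < best_cost:
            best_cost, best_assignment = total, assignment
    return best_cost, best_assignment


cost = [[1, 4, 3],
        [2, 1, 5]]
usage = [[2, 2, 2],
         [2, 2, 2]]

print(solve_gap_brute_force(cost, usage, [4, 4]))  # (5, (0, 1, 0))
print(solve_gap_brute_force(cost, usage, [2, 4]))  # (6, (1, 1, 0))
```

Tightening agent 0's capacity from 4 to 2 forces one of its jobs onto the more
expensive agent, which is exactly the effect that separates the GAP from the
classical Assignment Problem.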

2.3.2  Quadratic Assignment Problem.  The Quadratic Assignment Problem (QAP) is
another classic combinatorial optimization problem and is widely regarded as one of
the most difficult problems in this class. It was first introduced by Koopmans and
Beckmann to solve a facilities location problem [59]. The problem involves assigning
N facilities to N locations so that the cost of the assignment, Z, is minimal. It can
be defined as follows.

$$\min Z = \sum_{i=1}^{N} \sum_{j=1}^{N} a_{ij} x_{ij} + \sum_{i=1}^{N} \sum_{j=1}^{N} \sum_{k=1}^{N} \sum_{l=1}^{N} f_{ik} c_{jl} x_{ij} x_{kl}$$

s.t.

$$\sum_{j=1}^{N} x_{ij} = 1, \quad i = 1, 2, \ldots, N \qquad (4)$$

$$\sum_{i=1}^{N} x_{ij} = 1, \quad j = 1, 2, \ldots, N$$

$$x_{ij} \in \{0, 1\}, \quad i, j = 1, 2, \ldots, N$$

where $N$ = total number of facilities, $a_{ij}$ = fixed cost of locating facility
$i$ at location $j$, $f_{ik}$ = flow of material from facility $i$ to facility $k$,
$c_{jl}$ = cost of transferring a material unit from location $j$ to location $l$,
and $x_{ij}$ = 1 if facility $i$ is at location $j$; 0 otherwise.
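
Because the constraints force each facility onto exactly one location and each
location to receive exactly one facility, a feasible solution is a permutation, and Z
can be evaluated directly from it. The sketch below does this and finds the optimum
of a three-facility instance by enumeration. The function names and the data are
illustrative assumptions, and exhaustive search is shown only because the instance is
tiny.

```python
from itertools import permutations


def qap_cost(perm, a, f, c):
    """Cost Z of the assignment in which facility i is placed at location perm[i]."""
    n = len(perm)
    # Linear term: fixed cost of placing each facility at its location.
    fixed = sum(a[i][perm[i]] for i in range(n))
    # Quadratic term: flow between facilities i and k times the transfer cost
    # between the locations they occupy.
    flow = sum(f[i][k] * c[perm[i]][perm[k]] for i in range(n) for k in range(n))
    return fixed + flow


def solve_qap_brute_force(a, f, c):
    """Try all N! permutations; feasible only for very small N."""
    n = len(a)
    best = min(permutations(range(n)), key=lambda p: qap_cost(p, a, f, c))
    return qap_cost(best, a, f, c), best


# Made-up instance: three locations on a line, with facilities 0 and 1
# exchanging the most material.
a = [[2, 0, 0],
     [0, 0, 0],
     [0, 0, 0]]
f = [[0, 5, 1],
     [5, 0, 2],
     [1, 2, 0]]
c = [[0, 1, 2],
     [1, 0, 1],
     [2, 1, 0]]

print(qap_cost((0, 1, 2), a, f, c))    # 20
print(solve_qap_brute_force(a, f, c))  # (18, (2, 1, 0))
```

The number of permutations grows as N!, which is one way to see why exact methods
handle only rather small QAP instances.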

Many solution methods have been developed to address the QAP because of its
considerable practical importance in facility layout, machine scheduling, and other
applications. Even with fast computers, exact algorithms such as branch-and-bound
methods are only able to globally solve rather small QAPs in a reasonable amount of
time [34]. Therefore, researchers have concentrated on developing effective
heuristics for the QAP. The diverse QAP heuristics are not examined here. Instead,
extensive assessments of the QAP and its associated solution methods can be found in
[16, 34, 45]. Recent developments in facility layout are also covered in [40, 41].
Many classic NP-hard problems fall into the QAP category, such as bin-packing and the
knapsack problem [34]. DArep also resides in the QAP category of assignment problems
[53].

2.4  Ant  Colony  Optimization 

Despite current technology and rapid advances in every field, there are still some
problems that continue to elude scientists. Learning algorithms have been developed
in combination with artificial intelligence systems such as neural networks to try to
solve some of these problems, but imperfections and inefficiencies in both the
hardware and software often prevent reliable results. Scientists are now looking to
the world of insects, or swarm intelligence, for inspiration for new methods and
approaches to attacking complex problems. This section describes how one particular
form of swarm intelligence, the Ant Colony Optimization (ACO) meta-heuristic
algorithm, uses the cooperative nature of ants to solve difficult combinatorial
optimization problems.

2.4.1 Ant Algorithm Background. An individual ant is relatively unintelligent,
but as part of a colony, a complex group behavior emerges from the interactions of
individuals who exhibit simple behaviors by themselves [34,53]. This phenomenon is
indicative of all swarm intelligences, where something is created that is greater than the
sum of its parts. Using their social structure, ants are able to complete very complex tasks
without even knowing of the existence of the problem [34,53]. One of the complex behaviors
that naturally emerges from the ant colony is the ability to determine the shortest path
between two points. An important insight into ant behavior is that most communication among
individuals is based on the use of chemicals, called pheromones, produced by the ants.
Particularly important for the social life of ants is the trail pheromone, a pheromone that
individuals deposit while walking in search of food [53]. By sensing the level of pheromone
on trails, forager ants can follow the path found by other ants to get to food. The behavior
of pheromone laying and pheromone following is the inspiring source of ant colony
optimization. Initially, all ants move randomly from their starting point in search of food;
since there is no pheromone to start with, all ants choose between paths with equal
probability. While walking, ants deposit a pheromone trail on the ground; when their trail
forks, ants choose with higher probability those directions marked by a stronger pheromone
concentration [34,53]. However, ants sometimes behave randomly and select trails with
lighter concentrations or even investigate a new trail altogether. This random behavior
promotes the exploration and discovery of other paths, which enhances the chances of finding
the best solution possible. An ant continues to follow trails until it reaches its goal or
gets tired. Either way, each ant returns to the nest while laying pheromone. The
concentration of the pheromone trail is directly proportional to the impact of the goal
found. For example, if the food item is highly appetizing and could not be taken by the
single ant, then a large amount of pheromone is deposited on the ant's return trip to make
sure other ants find their way to it. Likewise, if the food item is not appetizing, is small
in quantity, or nothing was found at all, the ant deposits less pheromone, so other ants pay
less attention to that trail. Since pheromone evaporates over time, trails leading to a big
reward are continually reinforced, while trails leading to little or no reward fade away [53].

When  choosing  between  a  shorter  and  longer  trail  leading  to  the  same  goal,  those 
ants  choosing  the  shorter  branch  find  the  food  first  and  are  first  to  get  back  to  the  nest. 
Therefore,  more  ants  traverse  the  shorter  branch  and  the  pheromone  trail  on  this  branch 
will  grow  faster,  thus  increasing  the  probability  that  it  is  used  by  approaching  ants.  This 
process  of  positive  feedback  is  at  the  heart  of  the  ant  colony  behavior  that  can  very  quickly 
lead  all  the  ants  to  choosing  the  shortest  branch.  This  behavior  has  been  adapted  into  an 
algorithm  which  can  be  used  by  artificial  ants  to  find  minimum  cost  paths  on  graphs,  as 
explained in the next section.

2.4.2 The Ant Colony Meta-heuristic. A colony of (artificial) ants traverses a
graph where the graph's edges (directed or undirected) can be seen as the ants' possible
trails and the graph's vertices as decision points. While traversing the graph, each ant
records the path it takes and remembers its cost. Each ant's path is a solution to the
problem, and the more desirable the cost, the better the solution. After each iteration of
finding a solution, each ant deposits pheromone on the edges it traversed. The amount of
pheromone laid by each ant depends on the desirability of its solution cost, usually by
adding pheromone equal to the inverse of the solution cost (e.g., solution cost = 800,
pheromone laid = 1/800). In other words, edges used to find the best (least-cost) solutions
are reinforced with more pheromone than edges that lead to worse solutions. With time,
the pheromone on the edges evaporates, making undesirable edges less attractive.

At every vertex, the ant observes the pheromone levels of all outgoing edges of that
vertex. Based on each outgoing edge's pheromone concentration and a heuristic
desirability, it then makes a probabilistic choice about which edge to follow. The ants use
both pheromone (representing past good solutions) and a heuristic value (to guide ants when
little is known about an edge's desirability). Together these encourage the exploration of
solutions in the region of known good solutions, while the choice remains random enough
that good solutions are highly unlikely to go undiscovered [53].
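
The probabilistic choice described above can be sketched in a few lines of Python. This is a minimal illustration and not code from the thesis: it assumes the common random-proportional form in which each edge is weighted by its pheromone raised to a power alpha times its heuristic value raised to a power beta. The function names and the default values of `alpha` and `beta` are hypothetical.

```python
import random

def transition_probabilities(pheromone, heuristic, alpha=1.0, beta=2.0):
    """Return {edge: probability} for the outgoing edges of one vertex.

    pheromone and heuristic map each candidate edge to a positive number.
    alpha and beta are illustrative weights, not values taken from the text.
    """
    weights = {e: (pheromone[e] ** alpha) * (heuristic[e] ** beta) for e in pheromone}
    total = sum(weights.values())
    return {e: w / total for e, w in weights.items()}

def choose_edge(probabilities, rng=random):
    """Roulette-wheel selection: sample one edge according to its probability."""
    edges = list(probabilities)
    return rng.choices(edges, weights=[probabilities[e] for e in edges], k=1)[0]

# Two outgoing edges with equal heuristic value; edge "b" carries more pheromone.
probs = transition_probabilities(
    pheromone={"a": 1.0, "b": 3.0},
    heuristic={"a": 1.0, "b": 1.0},
)
picked = choose_edge(probs)
```

Because the choice is sampled rather than taken greedily, the weaker edge "a" is still followed some of the time, which is the source of exploration.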

Many NP-complete combinatorial problems have been attempted with ant algorithms.
These include the traveling salesman problem [14,33,35], job-shop scheduling [23], graph
coloring [24,25], vehicle routing [15,56], adaptive routing in communication networks
[10,29,30,63,66], sequential ordering [38], shortest common supersequence [54], and
multidimensional knapsack [3]. According to a majority of the works cited above, ant
algorithms perform as well as or better than the best known algorithms for these problems.

Typically, ant algorithms consist of a doubly nested loop. The outer loop controls the
number of iterations (usually called time steps) executed, and the inner loop controls each
ant as it traverses the graph and builds a solution. Once each ant has completed building
its solution, the pheromone on the graph's edges is updated to reflect the quality of the
solutions found in the current time step and to account for the evaporation of pheromone
from infrequently used edges. This updated graph is then used as the starting graph for the
next iteration.
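
The doubly nested loop can be made concrete with a small, self-contained sketch. It is an illustration under stated assumptions and not the thesis's implementation: the problem is a toy symmetric TSP, the heuristic is the inverse inter-city distance, every ant deposits the inverse of its tour cost, and the function name and all parameter values (`n_ants`, `n_steps`, `alpha`, `beta`, `rho`) are hypothetical.

```python
import random

def ant_system_tsp(dist, n_ants=8, n_steps=30, alpha=1.0, beta=2.0, rho=0.5, seed=0):
    rng = random.Random(seed)
    n = len(dist)
    tau = [[1.0] * n for _ in range(n)]           # pheromone on every edge
    best_tour, best_cost = None, float("inf")

    for _ in range(n_steps):                       # outer loop: time steps
        tours = []
        for _ in range(n_ants):                    # inner loop: one ant builds a tour
            tour = [rng.randrange(n)]
            while len(tour) < n:
                i = tour[-1]
                options = [j for j in range(n) if j not in tour]
                weights = [tau[i][j] ** alpha * (1.0 / dist[i][j]) ** beta
                           for j in options]
                tour.append(rng.choices(options, weights=weights, k=1)[0])
            cost = sum(dist[tour[k]][tour[(k + 1) % n]] for k in range(n))
            tours.append((tour, cost))
            if cost < best_cost:
                best_tour, best_cost = tour, cost

        # After all ants finish: evaporate everywhere, then deposit 1/cost.
        for i in range(n):
            for j in range(n):
                tau[i][j] *= (1.0 - rho)
        for tour, cost in tours:
            for k in range(n):
                a, b = tour[k], tour[(k + 1) % n]
                tau[a][b] += 1.0 / cost
                tau[b][a] += 1.0 / cost
    return best_tour, best_cost

# Four cities on the corners of a unit square.
D = [[0, 1, 2 ** 0.5, 1],
     [1, 0, 1, 2 ** 0.5],
     [2 ** 0.5, 1, 0, 1],
     [1, 2 ** 0.5, 1, 0]]
tour, cost = ant_system_tsp(D)
```

The pheromone table produced at the end of one time step is the starting table for the next, which is how information is passed between generations of ants.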

According to the Ant Colony Optimization pioneers Eric Bonabeau, Marco Dorigo,
and Guy Theraulaz, there are four essential elements to an ant algorithm. The following
description is adapted from [9].


1. Heuristic Desirability. This element gives the desirability of moving from vertex $i$
to vertex $j$ based only on local information. Heuristic desirability is denoted by
$\eta_{ij}$. Since this element is problem specific, no definitive or typical equation can
be given. However, for example's sake, at least two TSP algorithms use the inverse of
inter-city distances as the heuristic. Thus, cities that are closer together are more
attractive than cities that are farther apart.

2. Transition Rule. This rule determines the probability that an ant $k$ follows edge
$(i,j)$ when moving from vertex $i$ to vertex $j$. Let $J_i^k$ be the set of vertices ant
$k$ can move to if currently at vertex $i$. A typical rule is:

$$p_{ij}^k(t) = \frac{[\tau_{ij}(t)]^{\alpha} \cdot [\eta_{ij}]^{\beta}}{\sum_{l \in J_i^k} [\tau_{il}(t)]^{\alpha} \cdot [\eta_{il}]^{\beta}} \tag{5}$$

when $j \in J_i^k$ and 0 when $j \notin J_i^k$. In the above equation, $\tau_{ij}(t)$ is the
pheromone concentration on edge $(i,j)$ at time step $t$. Scaling parameters $\alpha$ and
$\beta$ control the influence of pheromone trail $\tau_{ij}(t)$ and heuristic desirability
$\eta_{ij}$, respectively. Note that $\sum_{j \in J_i^k} p_{ij}^k(t) = 1$ as long as
$J_i^k$ is non-empty.

3.  Constraint  Satisfaction.  This  element  ensures  that  the  solutions  found  are  feasible. 
For  example,  maintaining  a  list  of  visited  cities  ensures  that  ants  visit  each  city  exactly 
once  during  a  tour  in  the  TSP. 

4. Pheromone Update Rule. This rule governs the updating of pheromone on edges.
Pheromone is deposited on edges to reflect the quality of solutions found by the
ants. Evaporation of pheromone from the edges also occurs. In ant TSP algorithms
the concentration of pheromone deposited on edges is inversely proportional to the
length of a tour or set of tours, i.e., edges making up short tours receive more
pheromone than edges making up longer tours. Evaporation of pheromone is typically
modelled as a constant phenomenon; each time step a constant fraction of pheromone
evaporates from all edges. A typical update rule is:

$$\tau_{ij}(t+1) \leftarrow (1 - \rho) \cdot \tau_{ij}(t) + \rho \cdot \Delta\tau_{ij}(t) \tag{6}$$

where $\tau_{ij}(t)$ is the amount of pheromone on edge $(i,j)$ at time step $t$,
$\Delta\tau_{ij}(t)$ is the amount of new pheromone to be deposited on edge $(i,j)$ as a
result of the ants' collective activity during time step $t$, and $\rho$ determines the
fraction of old pheromone to new pheromone. Note that the determination of
$\Delta\tau_{ij}(t)$ is implementation specific. Past researchers have experimented with
$\Delta\tau_{ij}(t)$ expressions that (i) rank the solutions of the $k$ ants and deposit
pheromone on the edges of the top $m$, $1 < m < k$, solutions proportional to each
solution's rank [9], (ii) limit the maximum and minimum amount of pheromone on any edge,
and (iii) proportionally reinforce stronger pheromone trails less than weaker ones.

Pheromone trails can also be updated dynamically as the ants work. For example,
[32] updates edges on each ant's passing using:

$$\tau_{ij}(t) \leftarrow (1 - \rho) \cdot \tau_{ij}(t) + \rho \cdot \tau_0 \tag{7}$$

where $\tau_0$ is a constant amount of pheromone. This style of dynamic updating reduces
pheromone on visited edges and encourages exploration of non-visited edges by ants
working later in the current time step.
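
The two update rules, Eq. (6) and Eq. (7), translate almost directly into code. The sketch below is only an illustration: the dictionary-based pheromone table, the function names, and the numeric values are assumptions, and how the deposit amounts in `delta` are computed is left open because the text notes that this is implementation specific.

```python
def global_update(tau, delta, rho):
    """Eq. (6): tau <- (1 - rho) * tau + rho * delta, applied to every edge.

    Edges missing from delta receive no new pheromone and only evaporate.
    """
    return {e: (1.0 - rho) * tau[e] + rho * delta.get(e, 0.0) for e in tau}

def local_update(tau, edge, rho, tau0):
    """Eq. (7): applied in place to a single edge as an ant crosses it."""
    tau[edge] = (1.0 - rho) * tau[edge] + rho * tau0

tau = {("a", "b"): 1.0, ("b", "c"): 1.0}

# End of a time step: only edge (a, b) was part of a rewarded solution.
tau = global_update(tau, {("a", "b"): 2.0}, rho=0.5)

# During the next time step an ant crosses (a, b); tau0 is a small constant.
local_update(tau, ("a", "b"), rho=0.5, tau0=0.1)
```

With a small `tau0`, the local rule pulls a heavily used edge back toward the baseline, so ants that follow later in the same time step are nudged toward other edges.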

The Ant Colony Optimization algorithm can be adapted to solve many types of
optimization problems. Section 3.1 discusses how ACO was adapted in order to solve
the Quality-Sensitive DA Replication Problem, and section 4.4 discusses how this
optimization technique compares to other methods for solving this problem.

2.5  Reinforcement  Learning 

The Win or Learn Fast algorithm is a reinforcement learner. Since this thesis
investigation centers on a combination of ACO and WoLF, this section reviews reinforcement
learning concepts. It begins with an introduction to learning algorithms and their uses. It
then discusses policy hill-climbing algorithms and takes an in-depth look at the Win or
Learn Fast algorithm, how it has been applied to problems, and its effect.

2.5.1 Reinforcement Learning. Reinforcement learning concerns an agent who
must learn behavior through trial-and-error interactions with a dynamic environment. The
agent's job is to find a policy $\pi$, mapping states to actions, that maximizes some
long-run measure of performance [43]. However, in order to find the policy, the agent must
first explore the problem space and determine, at each state, the effect of choosing each
action as it impacts the long-term goal of finding the optimal solution. The following
example demonstrates the problem of exploration versus exploitation. The simplest possible
reinforcement-learning problem is known as the k-armed bandit problem [64]. In this
problem, an agent must pull one of k arms (gambling machines) at each time step so as to
maximize the total average reward. The agent is permitted a fixed number of pulls, h. Any
arm may be pulled on each turn. The machines do not require a deposit to play; the only
cost is wasting a pull. When arm i is pulled, machine i pays off 0 or 1, according to some
underlying probability parameter $p_i$, where payoffs are independent events and the $p_i$
are unknown [43]. The goal of this problem is to determine the best strategy for the agent
to take in order to obtain the maximum possible payoff.

This problem illustrates the fundamental tradeoff between exploitation and
exploration. An agent who believes that a particular machine has a fairly high payoff
probability could choose that arm every time, but it could be missing out on a better
probability of winning on another machine. The solution to this problem depends on the
number of pulls allowed or how long the agent is expected to play the game. The longer the
game lasts, the worse the consequences of prematurely converging on a sub-optimal machine,
and the more the agent should explore before converging on a solution [43].
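
The tradeoff can be explored with a short simulation of the k-armed bandit. The strategy used here, epsilon-greedy, is not described in the text; it is swapped in as one simple, well-known way to mix exploration and exploitation. The function name, payoff probabilities, number of pulls, and `epsilon` value are all hypothetical.

```python
import random

def epsilon_greedy_bandit(payoff_probs, pulls, epsilon, seed=0):
    """Play a k-armed bandit for a fixed number of pulls.

    With probability epsilon a random arm is pulled (exploration);
    otherwise the arm with the best observed mean payoff is pulled (exploitation).
    """
    rng = random.Random(seed)
    k = len(payoff_probs)
    counts = [0] * k        # pulls per arm
    means = [0.0] * k       # observed mean payoff per arm
    total = 0
    for _ in range(pulls):
        if rng.random() < epsilon:
            arm = rng.randrange(k)
        else:
            arm = max(range(k), key=lambda a: means[a])
        reward = 1 if rng.random() < payoff_probs[arm] else 0
        counts[arm] += 1
        means[arm] += (reward - means[arm]) / counts[arm]   # running average
        total += reward
    return total, counts

# Three machines whose payoff probabilities are unknown to the agent.
total, counts = epsilon_greedy_bandit([0.2, 0.5, 0.8], pulls=1000, epsilon=0.1)
```

Re-running with a larger or smaller `epsilon`, or with fewer pulls, gives a feel for how the allowed game length changes how much exploration is worthwhile.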

There are two main strategies for solving reinforcement-learning problems. The first
is to search in the problem space in order to find a behavior that performs well in the
problem environment. This approach has been taken by those working in genetic algorithms
and genetic programming, as well as some novel search techniques [62]. The second uses
statistical techniques and dynamic programming methods to estimate the utility of taking
actions in states in the world and then choosing the best action based on the statistics
generated. This second approach is the basis for the Win or Learn Fast algorithm, which is
examined in section 2.5.2.

2.5.2 Win or Learn Fast Algorithm. In most learning algorithms involving agents,
the solution space is seen as a collection of state-action pairs, represented by (s, a)
where $s \in S$ and $a \in A$. The set of states, S, is the set of particular
locations/places where an agent can be located (e.g., the states of DA rep are the
servers/(d, q) pairs). A is the set of all possible actions or moves an agent is allowed to
make when in a certain state (e.g., pick a server to host the (d, q) pair selected). In
policy hill-climbing (PHC) algorithms, each (s, a) pair is given a policy value in the
agent's problem space in order to guide an agent's decision making toward maximizing the
reward (or minimizing the cost) of the problem being solved. The policy of each (s, a)
pair, $\pi_{sa}$, is updated based on a probability that the action a taken from state s
will lead to a better solution [26]. Actions with a high policy value are considered
more important to producing optimal results and are more likely to be exploited by the
agent in the future [11].

Using the policy hill-climbing algorithm, an agent explores the solution space and
then, based on the reward (or lack thereof) received by performing action a in state s,
adjusts $\pi_{sa}$. Seeing the maximum reward as getting to the top of a hill, the
algorithm climbs toward the best reward. The policies are adjusted by an amount,
$\delta$, which is referred to as the learning rate or step size.

Normally, PHC algorithms use a single fixed step size, which for several reasons is
not ideal. First, a single fixed step size prevents the algorithm from increasing policies
by a larger or smaller amount than the fixed size when necessary. Second, it has been shown
that in fixed step size hill-climbing algorithms, an agent's policies never reach a steady
state, or converge, for the general problem case [26].
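
A single fixed-step policy adjustment can be sketched as follows. The text does not give the exact update, so this uses one common formulation as an assumption: probability mass of at most `delta` is moved from the other actions to the action that currently looks best, and the policy is kept a valid probability distribution. The function name, action labels, and values are hypothetical, and at least two actions are assumed.

```python
def phc_step(policy, q_values, delta):
    """Shift up to delta of probability mass toward the highest-valued action.

    policy maps each action available in one state to its probability;
    q_values maps the same actions to their estimated value.
    """
    best = max(q_values, key=q_values.get)
    others = [a for a in policy if a != best]
    new_policy = dict(policy)
    for a in others:
        # Never take more probability than the action currently has.
        take = min(policy[a], delta / len(others))
        new_policy[a] -= take
        new_policy[best] += take
    return new_policy

policy = {"left": 0.5, "right": 0.5}
q = {"left": 1.0, "right": 2.0}
policy = phc_step(policy, q, delta=0.1)
```

Here `delta` is the same at every step, which is exactly the limitation the paragraph above describes.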

WoLF (Win or Learn Fast) is a policy hill-climbing method by Bowling and Veloso
for changing the learning rate to encourage convergence in a multiagent reinforcement
learning scenario [12]. WoLF's technique is very intuitive. It suggests that an agent should
adapt quickly when doing more poorly than expected, and be cautious when it is doing
better than expected so as not to overstep a better strategy. This approach allows for
convergence in an agent's policies [12].

The novelty of WoLF is that it replaces the usual single fixed step size with two
learning rates for each state-action pair. The two step sizes are associated with the
concepts of winning and losing.

In WoLF, a state-action pair is winning when its policy is interpreted as leading
to an optimal solution. For state-action pairs that are considered winning, a small step
size, $\delta_w$, is used to update $\pi$ to encourage exploration in the solution space
around the winning state-action pair [12]. However, if a state-action pair is losing, it
has been shown to lead to a far from optimal solution. Therefore, a large step size,
$\delta_l$, is used to dramatically increase the $\pi$ of a losing state-action pair. This
step size allows the WoLF algorithm to exploit recent performance gains uncovered by the
losing state-action pair and to move more quickly towards a solution of optimal value. This
is what is meant by learn fast [26].

The impact an action has is determined by combining the concept of policy with
search algorithms such as Policy Hill Climbing [27], Gradient Descent [11], Q-Learning
[11,37], or Ant Colony Optimization [26]. When combined with these algorithms, WoLF
uses the strength of an adaptive decision policy to enable agents to converge more rapidly
to optimal solutions, thus making these algorithms more effective.

There are three main options when using the Win or Learn Fast algorithm that must
be manipulated to fit the problem at hand.

1. An estimation policy. This is the value derived from the optimization algorithms
discussed. It is the approximation of an agent's perceived environment at each state.
For gradient ascent algorithms, this would be the probabilities of an action being
selected and the expected payoff that would result. In Q-learning, the Q-value can be
used to determine an approximation of an agent's current environment. For ACO, an
edge's pheromone concentration can be used to estimate the ant colony's best guess
at which edges should be included in the optimal solution [26].

2. A rule for determining winning or losing. This rule determines whether an
agent is approaching a local optimum. In [11], Bowling and Veloso introduced
an algorithm, called WoLF-PHC, that combined the WoLF concept with the policy
hill-climbing variant of Q-learning. In accordance with WoLF, they proposed that the
value of $\delta$, which depends on whether the (s, a) pair is winning or losing, be
determined by:

$$\delta = \begin{cases} \delta_w, & \text{if } \sum_{a'} \pi_{(s,a')} Q(s,a') > \sum_{a'} \bar{\pi}_{(s,a')} Q(s,a') \\ \delta_l, & \text{otherwise} \end{cases} \tag{8}$$

where $a'$ are the actions available from state $s$, $Q(s,a')$ is an (s, a) pair's Q-value,
and $\bar{\pi}_{(s,a)}$ is the average of all $\pi_{(s,a')}$ values. Using this equation,
an agent is winning if its current policies for all actions in state s have a greater
benefit than using the average policy of all actions in state s [11].

However, Banerjee and Peng, in their Policy Dynamics-Based Win or Learn Fast Policy
Hill-Climbing (PDWoLF-PHC) algorithm [8], suggest an alternate definition of
winning and losing using the gradient of the policy. This definition relies on keeping
track of policy rate of change (policy velocity), $\Delta\pi_{(s,a)}$, and policy
acceleration, $\Delta^2\pi_{(s,a)}$, where

$$\delta = \begin{cases} \delta_w, & \text{if } \Delta\pi_{(s,a)}(t) \cdot \Delta^2\pi_{(s,a)}(t) < 0 \\ \delta_l, & \text{otherwise} \end{cases} \tag{9}$$

$$\Delta\pi_{(s,a)}(t) = \pi_{(s,a)}(t) - \pi_{(s,a)}(t-1) \tag{10}$$

and

$$\Delta^2\pi_{(s,a)}(t) = \Delta\pi_{(s,a)}(t) - \Delta\pi_{(s,a)}(t-1) \tag{11}$$

According to this definition, the policy for an (s, a) pair is winning if:

(a) the policy value is increasing (positive $\Delta\pi_{(s,a)}$) but the rate of increase
is slowing down (negative $\Delta^2\pi_{(s,a)}$) or

(b) the policy value is decreasing (negative $\Delta\pi_{(s,a)}$) but the rate of decrease
is slowing down (positive $\Delta^2\pi_{(s,a)}$).

This definition uses the fact that policy value change rates should slow down as they
near their optimums (from either the positive or negative side). Therefore, when the
change rate slows, the (s, a) pair is seen as winning. Otherwise, the pair is seen as
losing and needs to take larger step sizes (learn faster) in order to reach its optimal
value faster. This definition has been shown to converge more rapidly and require
less overhead than WoLF-PHC's definition [8,26].

3. Winning and learning step rates. These are the step sizes discussed above,
$\delta_w$ and $\delta_l$. These values determine the learning rate and affect the rate at
which an agent will converge on an optimal solution. In [26], the authors suggest a
win-to-learn ratio of 1:3 for good performance. However, since DA rep is a different
problem, many different ratios were tried in order to determine the optimal parameters
for WoLFAntDA (see section 4.3).

WoLF has been adapted to help solve many problems such as the Traveling Salesman
Problem [26] and stochastic matrix games [8,11,12,27]. Section 3.2 describes how the
Win or Learn Fast algorithm has been combined with AntDA in order to solve the
Quality-Sensitive DA Replication Problem.
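
The two rules for deciding between the winning and losing step sizes, Eq. (8) and Eqs. (9)-(11), can be compared side by side in a short sketch. This is an illustration only: the function names and data layout are hypothetical, the average policy is simply passed in by the caller, and the step sizes 0.01 and 0.03 are example values chosen to match the 1:3 win-to-learn ratio mentioned above.

```python
def wolf_phc_delta(policy, avg_policy, q_values, delta_w, delta_l):
    """Eq. (8): winning if the current policy's expected value in this state
    exceeds that of the average policy; otherwise losing."""
    current = sum(policy[a] * q_values[a] for a in q_values)
    average = sum(avg_policy[a] * q_values[a] for a in q_values)
    return delta_w if current > average else delta_l

def pdwolf_delta(pi_history, delta_w, delta_l):
    """Eqs. (9)-(11): pi_history ends with pi(t-2), pi(t-1), pi(t) for one (s, a) pair."""
    p2, p1, p0 = pi_history[-3:]
    velocity = p0 - p1                      # Eq. (10)
    acceleration = velocity - (p1 - p2)     # Eq. (11)
    # Eq. (9): winning when the policy change is slowing down.
    return delta_w if velocity * acceleration < 0 else delta_l

# Policy value still rising, but more slowly than before: winning.
slowing = pdwolf_delta([0.2, 0.4, 0.5], delta_w=0.01, delta_l=0.03)

# Policy value rising faster than before: losing, so learn fast.
speeding = pdwolf_delta([0.2, 0.3, 0.5], delta_w=0.01, delta_l=0.03)

# WoLF-PHC rule: the current policy favors the better action more than the average does.
phc = wolf_phc_delta(
    policy={"a": 0.8, "b": 0.2},
    avg_policy={"a": 0.5, "b": 0.5},
    q_values={"a": 1.0, "b": 0.0},
    delta_w=0.01,
    delta_l=0.03,
)
```

The policy-dynamics rule needs only the last three policy values for a pair, whereas the WoLF-PHC rule needs an average policy and the Q-values for every action in the state.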

This chapter described the Quality-Sensitive DA Replication Problem as well as other
assignment problems related to it. It also covered the ACO and WoLF algorithms and how
they have been adapted to fit other optimization problems.

III. Methodology

This chapter begins with an overview of how the Ant Colony Optimization and Win or
Learn Fast algorithms were adapted in order to be implemented on the Quality-Sensitive
DA Replication Problem. Next, there is an introduction to other solution methods used
for performance comparison with the proposed algorithms: AntDA, WoLFAntDA, and
PD-WoLFAntDA. This chapter concludes with an explanation of the Server Filling heuristic,
which was combined with the Ant Colony Optimization algorithm to enhance the
performance of AntDA.

3.1 AntDA: An ACO Algorithm for DA rep

The first proposed algorithm, AntDA, is the adaptation of ACO to fit the
Quality-Sensitive DA Replication Problem. This section covers the basic behavior,
transition rules, and the rules for depositing pheromone on edges.

3.1.1 Basic Behavior. In AntDA, ants operate on a bipartite graph representing
an instance of DA rep (Fig. 6). The graph, $G = (V, E)$, consists of a set of vertices, $V$,
and edges connecting vertices, $E$. The vertices are divided into two groups, $DQ$ and $S$,
such that $V = DQ \cup S$ and $DQ \cap S = \emptyset$. Each vertex in $DQ$ represents a
(d, q) pair (a quality q of DA d).

The vertices in $S$ represent the servers. Each $dq \in DQ$ is connected to every
$s \in S$ by a directed edge (dq, s). Similarly, each $s \in S$ is connected to every
$dq \in DQ$ by a directed edge (s, dq). Even though each edge (dq, s) has a reverse edge
(s, dq), undirected edges are not used since pheromone is interpreted differently on edges
of type (dq, s) versus edges of type (s, dq).
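
One way to hold this graph in memory is as two separate pheromone tables, one per edge direction. The sketch below is a hypothetical data layout and not the thesis's implementation; the function name, the initial pheromone value `tau0`, and the example (d, q) pairs and servers are assumptions.

```python
from itertools import product

def build_antda_graph(dq_pairs, servers, tau0=1.0):
    """Return separate pheromone tables for (dq, s) edges and (s, dq) edges.

    Keeping two tables lets pheromone on (dq, s) mean something different
    from pheromone on the reverse edge (s, dq).
    """
    tau_dq_to_s = {(dq, s): tau0 for dq, s in product(dq_pairs, servers)}
    tau_s_to_dq = {(s, dq): tau0 for s, dq in product(servers, dq_pairs)}
    return tau_dq_to_s, tau_s_to_dq

# Each DQ vertex is a (DA, quality) pair; each S vertex is a server name.
dq_pairs = [("d1", 1), ("d1", 2), ("d2", 1)]
servers = ["s1", "s2"]
tau_dq_to_s, tau_s_to_dq = build_antda_graph(dq_pairs, servers)

# Reinforcing one direction leaves the reverse edge untouched.
tau_dq_to_s[(("d1", 1), "s1")] = 5.0
```

Every (d, q) pair is connected to every server in both directions, so each table holds |DQ| x |S| entries.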

Ants construct solutions by moving back and forth between vertices in $DQ$ ((d, q)
pairs) and vertices in $S$ (servers), creating a replica and assigning dq request load or
adjusting the service quality of an existing replica on the server chosen for assignment.

Ants work independently (each maintains its own solution space) but share the same
graph. An ant works on a solution until either server capacity is exhausted or all request
load has been assigned to the servers. Once all ants have built their solutions, they
deposit pheromone on the shared graph, pheromone evaporation takes place, and then the next
time step begins. Ants are placed at a random server vertex at the beginning of each time
step. The algorithm for AntDA is shown in Algorithm 1.

[Figure 6 graphic: DQ vertices ((d, q) pairs) connected to S vertices (servers)]

Figure 6: The bipartite problem graph used by AntDA. Although all (d, q) pairs are
connected to servers via directional edges and vice-versa, single non-directional edges are
shown here for simplicity.

3.1.2 Moving From Servers to (d, q) pairs. An ant at vertex s must decide which
(d, q) pair should be assigned next (Algorithm 1, step 14). Let $DQ_s^k$ be the set of dq
vertices which are still capable of being assigned to servers. A (d, q) pair can be
assigned if:

1. it has some amount of unassigned request load ($rem(rl_{d,q}) > 0$), and

2. there exists a server s such that the net change in update load on s because of placing
a replica of DA d at quality q on s is less than the remaining capacity on s:
$rem(C_s) >$ net change in $ul_s$ because of hosting a replica of the (d, q) pair.

Algorithm 1 The AntDA Algorithm
 1: Initialize parameter values
 2: for each edge (dq, s) $\in$ graph G do
 3:     $\tau_{(dq,s)} \leftarrow \tau_0$
 4: end for
 5: for each edge (s, dq) $\in$ graph G do
 6:     $\tau_{(s,dq)} \leftarrow \tau_0$
 7: end for
 8: for each time step t do
 9:     Distribute graph G to all ants
10:     for each ant k do
11:           = 0
12:         Randomly select a starting server $s \in S$
13:         while $DQ_s^k \neq \emptyset$ and $S_{dq}^k \neq \emptyset$ do
14:             Select $dq \in DQ_s^k$ according to equation 12
15:             Select $s \in S_{dq}^k$ according to equation 14
16:             Assign dq to s.
17:             Adjust server capacity to reflect assignment.
18:             Adjust dq remaining load.
19:             Update $DQ_s^k$ and $S_{dq}^k$
20:             Invoke the server filling algorithm - Section 3.3
21:         end while
22:     end for
23:     //Now all ants have built tours for time step t
24:     for each ant k with one of the top m solutions do
25:         Update pheromone using the rules in Section 3.1.4
26:     end for
27: end for


If $DQ_s^k = \emptyset$, the algorithm terminates. Otherwise, the probability that ant k
selects edge (s, dq) is given by:

$$p_{s,dq}^k(t) = \begin{cases} \dfrac{[\tau_{s,dq}(t)]^{\alpha} \cdot [\eta_{s,dq}]^{\beta}}{\sum_{dq' \in DQ_s^k} [\tau_{s,dq'}(t)]^{\alpha} \cdot [\eta_{s,dq'}]^{\beta}}, & \text{when } dq \in DQ_s^k \\ 0, & \text{when } dq \notin DQ_s^k \end{cases} \tag{12}$$

where $\tau_{s,dq}(t)$ is the pheromone concentration on (s, dq). The scaling parameters
$\alpha$ and $\beta$ again control the relative importance of pheromone and heuristic
desirability. Also, $\eta_{s,dq}$ is the heuristic desirability of selecting (s, dq) and is
given by:

$$\eta_{s,dq} = \frac{ul_{d,q} \cdot rem(rl_{d,q})}{\sum_{dq' \in DQ_s^k} ul_{dq'} \cdot rem(rl_{dq'})} \tag{13}$$

Dividing $ul_{d,q} \cdot rem(rl_{d,q})$ by a server's remaining capacity, $rem(C_s)$,
estimates the update burden incurred by creating replicas on servers of size $rem(C_s)$.
Eq. (13) is an appropriate heuristic since it prefers (d, q) pairs most likely to produce
high update burdens (no matter which servers are used). Note that $\eta_{s,dq}$ values
change as the ant constructs its solution.

After  making  its  selection,  the  ant  traverses  the  edge  to  the  selected  (d,  q)  pair  and 
then  must  choose  a  new  server. 
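
The selection of the next (d, q) pair, Eqs. (12) and (13), can be sketched as below. This is an illustrative reading of the two equations and not the thesis's code: the function name, the dictionary layout, and the example numbers are assumptions, `candidates` stands for the set of still-assignable pairs, and `alpha` and `beta` default to 1 purely for the example.

```python
def server_to_dq_probabilities(s, candidates, tau, ul, rem_rl, alpha=1.0, beta=1.0):
    """Probability of moving from server s to each still-assignable (d, q) pair.

    tau[(s, dq)] : pheromone on edge (s, dq)
    ul[dq]       : update load of the (d, q) pair
    rem_rl[dq]   : request load of the (d, q) pair not yet assigned
    """
    burden = {dq: ul[dq] * rem_rl[dq] for dq in candidates}
    total_burden = sum(burden.values())
    eta = {dq: burden[dq] / total_burden for dq in candidates}            # Eq. (13)
    weight = {dq: tau[(s, dq)] ** alpha * eta[dq] ** beta for dq in candidates}
    total = sum(weight.values())
    return {dq: weight[dq] / total for dq in candidates}                  # Eq. (12)

A, B = ("d1", 1), ("d2", 1)
probs = server_to_dq_probabilities(
    "s1",
    candidates=[A, B],
    tau={("s1", A): 1.0, ("s1", B): 1.0},   # no pheromone preference yet
    ul={A: 3.0, B: 1.0},
    rem_rl={A: 10.0, B: 10.0},
)
```

Because `rem_rl` shrinks as load is assigned, the heuristic values must be recomputed on every visit rather than cached, which matches the note that they change as the ant constructs its solution.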


3.1.3 Transitioning From (d, q) pairs to Servers. When at vertex dq, ant k must
find a server on which the (d, q) pair represented by dq will create a replica and assign
load (Algorithm 1, step 15). Let $S_{dq}^k$ be the set of servers (vertices) upon which
request load of the (d, q) pair represented by vertex dq can be assigned. Let $rem(C_s)$
represent the unused (remaining) capacity of server s. Server s is available for assignment
if the net change in update load on s caused by its hosting DA d at quality q is less than
$rem(C_s)$ (i.e., s will be able to handle request load for the (d, q) pair). This is the
server hosting condition.

The probability that ant k selects edge (dq, s) is

$$p_{dq,s}^k(t) = \begin{cases} \dfrac{[\tau_{dq,s}(t)]^{\alpha} \cdot [\eta_{dq,s}]^{\beta}}{\sum_{s' \in S_{dq}^k} [\tau_{dq,s'}(t)]^{\alpha} \cdot [\eta_{dq,s'}]^{\beta}}, & \text{when } s \in S_{dq}^k \\ 0, & \text{when } s \notin S_{dq}^k \end{cases} \tag{14}$$

$\tau_{dq,s}(t)$ is the pheromone concentration on (dq, s) at time step t. Parameters
$\alpha$ and $\beta$ are constants governing the relative importance of pheromone to the
heuristic desirability, $\eta_{dq,s}$, of traveling along edge (dq, s):

$$\eta_{dq,s} = \frac{rem(C_s)}{\sum_{s' \in S_{dq}^k} rem(C_{s'})} \tag{15}$$

where $rem(rl_{d,q})$ is the amount of request load for the (d, q) pair yet to be assigned
to a server. The heuristic is based on the idea that greedily selecting the largest server
should reduce the number of replicas created and, thus, the update burden produced. Note
that $\eta_{dq,s}$ values change as servers are assigned. The heuristic in Equation (15)
mirrors the greedy selection criterion of the greedy algorithm (see Section 3.4) in the way
that it favors the selection of the server with the most remaining capacity.
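
The server choice of Eqs. (14) and (15), including the server hosting condition, can be sketched in the same style. This is a hypothetical illustration and not the thesis's code: the function name, the `net_update_load` table (the net change in update load each server would incur by hosting the pair), and the example capacities are assumptions.

```python
def dq_to_server_probabilities(dq, servers, tau, rem_capacity, net_update_load,
                               alpha=1.0, beta=1.0):
    """Probability of moving from vertex dq to each server that can host it."""
    # Server hosting condition: the net change in update load must fit in rem(C_s).
    eligible = [s for s in servers if net_update_load[s] < rem_capacity[s]]
    if not eligible:
        return {}
    total_capacity = sum(rem_capacity[s] for s in eligible)
    eta = {s: rem_capacity[s] / total_capacity for s in eligible}         # Eq. (15)
    weight = {s: tau[(dq, s)] ** alpha * eta[s] ** beta for s in eligible}
    total = sum(weight.values())
    return {s: weight[s] / total for s in eligible}                       # Eq. (14)

dq = ("d1", 2)
probs = dq_to_server_probabilities(
    dq,
    servers=["s1", "s2", "s3"],
    tau={(dq, "s1"): 1.0, (dq, "s2"): 1.0, (dq, "s3"): 1.0},
    rem_capacity={"s1": 60.0, "s2": 20.0, "s3": 5.0},
    net_update_load={"s1": 10.0, "s2": 10.0, "s3": 10.0},
)
```

In this example server s3 fails the hosting condition and drops out, and with equal pheromone the remaining probability is split in proportion to remaining capacity.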

After selecting edge (dq, s), the ant moves from vertex dq to vertex s. Once at s, the
ant creates a replica for the (d, q) pair and assigns as much remaining request load of the
(d, q) pair, $rem(rl_{d,q})$, to s as possible. If a replica of DA d already exists on s, then the ant
adjusts the quality level of the replica if needed (increases the update load of the replica).
The server's remaining capacity, $rem(C_s)$, is decreased based on the amount of update load
and request load assigned.

After  creating  a  replica  of  DA  d  at  quality  level  q  on  server  s,  the  ant  can  attempt  to 
invoke  the  Server-Filling  (SF)  heuristic  (explained  in  section  3.3).  In  cases  where  a  replica 
of  d  does  not  use  all  the  capacity  on  its  host  server  s,  the  SF  heuristic  looks  to  assign  request 
load  of  other  qualities  of  d  to  s.  After  making  its  selection,  the  ant  traverses  the  edge  to  the 
selected  server  node  and  then  transitions  back  to  a  (d,  q)  pair  (Section  3.1.2). 

3.1.4 Pheromone Update Rule. When each ant has constructed a solution to
DA rep, it is time to deposit pheromone on the shared graph (Algorithm 1, step 25). By
finding a solution, an ant has essentially assigned values for the $x_{s,d,q}$ and $X_{s,d,q}$ variables
described in the informal version of the problem shown in Fig. 4.

Recall  that  the  DA  rep  problem’s  goal  is  to  minimize  the  amount  of  update  load, 
UL,  expended  in  a  solution  that  assigns  all  request  load.  In  other  words,  DA  rep  seeks  to 
minimize  the  following  equation: 

$$
UL = \sum_{s \in S} \sum_{d \in D} \sum_{q \in Q_d} ul_{d,q} \cdot X_{s,d,q}
\tag{16}
$$
Quite naturally, the minimization function of the formal problem (Eq. 16) can be
used to rate the solutions found by the ants. The number of ants allowed to update edges
and exactly how much pheromone each updating ant deposits are tunable parameters subject
to experimentation. The pheromone deposit scheme which worked best for AntDA is
explained in section 4.2.

Since better solutions have lower update burdens, the amount of pheromone deposited
by ants is inversely proportional to a solution's update burden. However, low update burdens
are not always better, since some ants' solutions may be infeasible (i.e., they do
not assign all request load). Differentiating between feasible and infeasible solutions when
deciding how much pheromone to deposit on the edges used in an ant's solution is easily
handled. Let $UB_k(t)$ be the update burden of ant k's solution after time step t as computed
by (3). Then adjust $UB_k(t)$ to account for infeasible assignments as follows:


$$
UB'_k(t) = \frac{UB_k(t)}{\left( \dfrac{RL_k(t)}{RL} \right)^{\omega}}
\tag{17}
$$


where $RL_k(t)$ is the amount of request load assigned by ant k in time step t and $\omega$ is a
constant that determines the magnitude of the penalty paid for not assigning all request
load. Eq. (17) inflates the update burden of an infeasible assignment by dividing it by the
fraction of request load that was satisfied, raised to the power $\omega$. For a feasible solution
the fraction is 1 and the update burden is unchanged. Thus, infeasible assignments cannot
compete with feasible ones.
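As a rough illustration of the penalty in Eq. (17), the following Python sketch computes the adjusted update burden. The function name and the numbers are hypothetical, ω = 4 is the value listed in Table 5, and returning infinity when no request load is assigned is an assumption of this sketch rather than something the text specifies.

```python
def adjusted_update_burden(ub, rl_assigned, rl_total, omega=4.0):
    """Eq. 17: penalize a solution that leaves request load unassigned."""
    if rl_assigned <= 0:
        # Assumption of this sketch: an ant that assigned nothing is worst possible.
        return float("inf")
    return ub / (rl_assigned / rl_total) ** omega


# A feasible solution keeps its update burden.
feasible = adjusted_update_burden(120.0, 1000.0, 1000.0)
# An infeasible solution with a lower raw burden ends up rated worse.
infeasible = adjusted_update_burden(100.0, 900.0, 1000.0)
```

In this hypothetical case the infeasible solution assigns 90% of the request load, so its burden of 100 is divided by 0.9^4 and it ranks behind the feasible solution with burden 120.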

Once $UB'_k(t)$ has been determined, it is used to calculate the amount of new pheromone
ant k will deposit. The ants with the m best solutions are allowed to deposit pheromone after
each time step. More specifically, if edge e was used in the i-th best solution and $i \le m$,
then the amount of pheromone deposited on e by the ant that produced the i-th best solution
is

$$
\Delta\tau^i_e(t) = \frac{\gamma}{UB'_i(t)}
\tag{18}
$$

where $\gamma$ is a constant. For AntDA, $\gamma$ was set to 1 during experimentation and was found
to have little, if any, impact on performance. If an edge e was not used by ant i, then
$\Delta\tau^i_e(t) = 0$.

Let $\Delta\tau_e(t) = \sum_{i=1}^{m} \Delta\tau^i_e(t)$ be the amount of new pheromone to be deposited on edge
e because of the m solutions chosen. The amount of pheromone on the edges in graph G is
then updated as is typically done in ACO [9]:

$$
\tau_e(t+1) \leftarrow (1 - \rho) \cdot \tau_e(t) + \rho \cdot \Delta\tau_e(t)
\tag{19}
$$
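The deposit and evaporation steps can be sketched in Python as follows. This is a minimal illustration, not the thesis implementation: the function name, the edge keys, and the two-ant example are hypothetical, the deposit is taken to be γ divided by the adjusted update burden (inversely proportional, as described above), and ρ = 0.8 and γ = 1 are the values from Table 5.

```python
def update_pheromone(tau, solutions, m, rho=0.8, gamma=1.0):
    """Apply one round of pheromone deposit and evaporation.

    tau       -- dict mapping edge -> current pheromone
    solutions -- list of (adjusted_update_burden, edges_used) pairs, one per ant
    m         -- number of best ants allowed to deposit
    """
    best = sorted(solutions, key=lambda sol: sol[0])[:m]
    deposit = {e: 0.0 for e in tau}
    for adjusted_ub, edges in best:
        for e in set(edges):                   # an edge counts once per solution
            deposit[e] += gamma / adjusted_ub  # deposit from one of the m best ants
    # Blend the old pheromone with the summed deposit.
    return {e: (1.0 - rho) * tau[e] + rho * deposit[e] for e in tau}


# Hypothetical two-edge graph and two ants; only the best ant deposits (m = 1).
tau = {("s1", "a1"): 0.1, ("s2", "a1"): 0.1}
solutions = [(100.0, [("s1", "a1")]), (200.0, [("s2", "a1")])]
new_tau = update_pheromone(tau, solutions, m=1)
```

Edges that no depositing ant used simply decay by the factor (1 − ρ), which is how the rule gradually shifts probability mass toward the edges of good solutions.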

Once  implemented,  AntDA’s  parameters  were  tuned,  the  server  filling  heuristic  added, 
and  simulations  run.  Section  4.4  describes  the  performance  of  AntDA  and  how  it  compared 
to  other  search  algorithms  in  solving  the  Quality-Sensitive  DA  Replication  Problem. 

3.2  WoLFAntDA:  A  Reinforcement  Learning  ACO  algorithm  for  DA  rep 

3.2.1  Motivation:  Why  WoLF?  Although  the  Ant  Colony  Optimization  algorithm 
has  very  good  search  capability  in  optimization  problems,  it  still  has  some  drawbacks  such 
as stagnation, computing time, and premature convergence. Stagnation and premature convergence
can be limited by tuning of parameters for each individual problem. However,
computing  time  is  due  to  random  decision  making  and  problem/graph  size  and  can  only  be 
slightly  reduced  by  parameter  tuning.  Thus,  ACO  is  not,  at  present,  an  effective  method  for 
some  problems. 

In AntDA, tuning the parameters of the ACO algorithm allowed for better solutions
and quicker convergence (see Section 4.7). However, there is still room for improvement.
In [26], the authors combined a variable-step policy hill-climbing algorithm called Win or
Learn Fast with an ACO algorithm for solving the Traveling Salesman problem. They found
that the addition of this learning algorithm provided faster convergence to optimal solutions.
Therefore, in an attempt to achieve the goals of faster convergence and better solution values
for larger problems, WoLFAntDA and PD-WoLFAntDA combine the AntDA algorithm
with two versions of Win or Learn Fast (WoLF-PHC [12] and PDWoLF-PHC [8], respectively).
These algorithms were explained in section 2.5.2.

3.2.2  How  the  Win  or  Learn  Fast  algorithm  was  modified  to  work  with  AntDA. 

As described in section 2.5.2, the Win or Learn Fast algorithm has three characteristics
that must be adapted to each specific problem to which it is applied. The three
characteristics adapted for use in both WoLFAntDA and PD-WoLFAntDA are:

1. An estimation policy. In AntDA, each ant traverses the problem graph and constructs
a solution. Afterwards, the ants with the top m solutions deposit pheromone on the
edges used to construct their solutions. Therefore, the pheromone concentration on
an edge is the best estimate available as to which edges should be used to construct
the optimal solution. Hence, the estimation policy used in WoLFAntDA and
PD-WoLFAntDA is edge pheromone.

2. A rule for determining winning or losing. Two rules have been suggested
for determining winning or losing in the Win or Learn Fast algorithm. Both of these
methods have been described in section 2.5.2. Although it has been shown that the
PDWoLF-PHC definition of winning and losing requires less computation and memory
overhead and converges more rapidly in other problems than the WoLF-PHC
definition [26], neither one has been applied to DA rep. Therefore, AntDA has been
implemented with both rules (adapted to ACO) to determine which allows for better
solutions and convergence rates for this problem. WoLFAntDA was implemented
with the WoLF-PHC definition and PD-WoLFAntDA with the PDWoLF-PHC definition.
The implementations are described in the following sections and results are
shown in section 4.7.

3. Winning and learning step rates. After testing multiple values, in the same manner
as in parameter selection, the step rates used in WoLFAntDA and PD-WoLFAntDA
are $\delta_w$ = 0.005 and $\delta_l$ = 0.030. These affect how the policy
values for an edge are updated (explained more fully in section 3.2.7).
Algorithm 2 The WoLFAntDA and PD-WoLFAntDA Algorithm
1: Initialize parameter values
2: for each edge (dq, s) $\in$ graph G do
3:     $\tau_{(dq,s)} \leftarrow \tau_0$
4: end for
5: for each edge (s, dq) $\in$ graph G do
6:     $\tau_{(s,dq)} \leftarrow \tau_0$
7: end for
8: for each time step t do
9:     Distribute graph G to all ants
10:     for each ant k do
11:         = 0
12:         Randomly select a starting server s $\in$ S
13:         while $DQ^k_s \neq \emptyset$ and $S^k_{dq} \neq \emptyset$ do
14:             Select dq $\in DQ^k_s$ according to equation 20
15:             Select s $\in S^k_{dq}$ according to equation 21
16:             Assign dq to s.
17:             Adjust server capacity to reflect assignment.
18:             Adjust dq remaining load.
19:             Update dq $\in DQ^k_s$ and s $\in S^k_{dq}$
20:             Invoke the server filling algorithm (Section 3.3)
21:         end while
22:     end for
23:     //Now all ants have built tours for time step t
24:     for each ant k with one of the top m solutions do
25:         Update pheromone using the rules in section 3.2.6
26:     end for
27:     for each ant k with one of the top p solutions do
28:         Update policy using the rules in section 3.2.7
29:     end for
30: end for


3.2.3 Basic Behavior. In WoLFAntDA and PD-WoLFAntDA, ants still operate
on a bipartite graph representing an instance of DA rep (Fig. 6) and make decisions
much as they do in AntDA. The main difference is that decision-making is now sensitive to
edge pheromone, the heuristic desirability of the edge, and edge policy values. Each edge
(whether server to (d, q) pair or (d, q) pair to server) maintains each of these values, and
each is updated using the methods described in the following sections. The algorithm for
WoLFAntDA and PD-WoLFAntDA is shown in Algorithm 2.
3.2.4 Moving From Servers to (d, q) pairs. An ant at vertex s must decide which
(d, q) pair should be assigned next (Algorithm 2, step 14). The selection of the (d, q) pair
to be assigned is subject to the same constraints as in AntDA (see section 3.1.2), and the
decision-making process has been adjusted slightly to allow policies to affect (d, q)
pair selection.

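The probability that ant k selects edge (s, dq) is given by:

$$
p^k_{s,dq}(t) =
\begin{cases}
\dfrac{[\tau_{s,dq}(t)]^{\alpha} \cdot [\pi_{s,dq}(t) \cdot \eta_{s,dq}]^{\beta}}{\sum_{dq' \in DQ^k_s} [\tau_{s,dq'}(t)]^{\alpha} \cdot [\pi_{s,dq'}(t) \cdot \eta_{s,dq'}]^{\beta}}, & \text{when } dq \in DQ^k_s \\[1ex]
0, & \text{when } dq \notin DQ^k_s
\end{cases}
\tag{20}
$$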


where $\tau_{s,dq}(t)$ is the pheromone concentration on (s, dq) and $\pi_{s,dq}(t)$ is the policy value on
(s, dq) at time step t. The scaling parameters $\alpha$ and $\beta$ again control the relative importance
of pheromone and policy/heuristic desirability. Once again, the heuristic desirability for
the transition from a server to a (d, q) pair is calculated the same way in WoLFAntDA and
PD-WoLFAntDA as in AntDA and is shown in Eq. 13.

Equation 20 is identical to Eq. 12 except for the addition of the policy term, $\pi$. The
introduction of this term allows the ants to be sensitive to pheromone, policy, and heuristic
values when selecting a (d, q) pair.

After  making  its  selection,  the  ant  traverses  the  edge  to  the  selected  (d,  q)  pair  and 
then  must  choose  a  server  node. 


3.2.5 Transitioning From (d, q) pairs to Servers. When at vertex dq, ant k must
select a server on which the (d, q) pair represented by dq will create a replica and assign
load (Algorithm 2, step 15). This selection of the server follows the AntDA method but
with slight changes that allow policies to affect server selection.
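The probability that ant k selects edge (dq, s) is

$$
p^k_{dq,s}(t) =
\begin{cases}
\dfrac{[\tau_{dq,s}(t)]^{\alpha} \cdot [\pi_{dq,s}(t) \cdot \eta_{dq,s}]^{\beta}}{\sum_{s' \in S^k_{dq}} [\tau_{dq,s'}(t)]^{\alpha} \cdot [\pi_{dq,s'}(t) \cdot \eta_{dq,s'}]^{\beta}}, & \text{when } s \in S^k_{dq} \\[1ex]
0, & \text{when } s \notin S^k_{dq}
\end{cases}
\tag{21}
$$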


where $\tau_{dq,s}(t)$ is the pheromone concentration on (dq, s) at time step t and $\pi_{dq,s}(t)$ is the
policy value on (dq, s) at time step t. $\alpha$ and $\beta$ are constants governing the relative importance
of pheromone to the policy/heuristic desirability of traveling along edge (dq, s).
The heuristic desirability is calculated the same way in WoLFAntDA and PD-WoLFAntDA as
in AntDA and is shown in Eq. 15.

Equation 21 is identical to Eq. 14 except for the addition of the policy term, $\pi$. The
introduction of this term makes the ants sensitive to pheromone, policy, and heuristic
values when selecting a server. Early in the algorithm, when pheromone and policies are
fairly neutral, ants are guided by the heuristics, $\eta$. However, as pheromone and policies
become more differentiated, ant behavior becomes more dependent on them [26].
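To make the role of the policy term concrete, the following Python sketch evaluates the server-selection probabilities with policy, pheromone, and heuristic values supplied per candidate server. The function name, the dictionaries, and the values of α and β used in the example are illustrative assumptions; the tuned WoLFAntDA parameter values are not restated here.

```python
def policy_server_probabilities(tau, pi, eta, candidates, alpha, beta):
    """Selection probabilities where the policy value scales the heuristic term."""
    weight = {
        s: (tau[s] ** alpha) * ((pi[s] * eta[s]) ** beta)
        for s in candidates
    }
    norm = sum(weight.values())
    return {s: w / norm for s, w in weight.items()}


# Hypothetical case: identical pheromone and heuristic values, so only the
# policy values distinguish the two servers.
tau = {"s1": 0.1, "s2": 0.1}
eta = {"s1": 0.5, "s2": 0.5}
pi = {"s1": 0.6, "s2": 0.4}
p = policy_server_probabilities(tau, pi, eta, ["s1", "s2"], alpha=1.0, beta=2.0)
```

Because the policy value sits inside the bracket raised to β, differences in policy are amplified in the same way as differences in heuristic desirability.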

After selecting edge (dq, s), the ant moves from vertex dq to vertex s and updates its
graph in the same manner as AntDA. It then transitions back to a dq node (Section 3.2.4).


3.2.6  Pheromone  Update  Rule.  When  each  ant  has  constructed  a  solution  to 
DA  rep  it  is  time  to  deposit  pheromone  on  the  shared  graph  (Algorithm  2,  step  25).  For 
WoLFAntDA  and  PD-WoLFAntDA,  this  process  is  done  in  the  exact  manner  as  AntDA 
which  is  described  in  section  3.1.4. 


3.2.7 Policy Updates. After all ants have constructed solutions and pheromone
has been deposited on the shared graph, it is time to update policy values for the edges
(Algorithm 2, step 28). The following equations regulate the policy update for these edges.
This section has been adapted from [8,11]; a similar approach was used in [26] to
merge WoLF with ACS-TSP (an ACO algorithm) to solve a Traveling Salesman Problem.
The variables used for policy update depend on what kind of edge is being updated.
3.2.7.1 Server to (d, q) pair. The first step (Equations 22 and 23) is to
determine whether the particular edge is winning or losing. Equation 22 is the WoLF-PHC
definition [12] for winning and losing while equation 23 is the PDWoLF-PHC definition [8].
In section 2.5.2, these two different rules were discussed in depth. The equations remain
similar to those of [12] and [8] except for a minor notation change. Both of these equations
originally used the idea of state/action pairs (explained in section 2.5.2), but this concept
has been adapted to fit the pheromone and graph nature of AntDA. In WoLFAntDA and
PD-WoLFAntDA, for an ant traversing from a server to a (d, q) pair, the state is the server,
s, the ant has currently chosen, and the action is one of the (d, q) pairs, dq, capable of being
hosted by the server s. These equations reveal the learning rate, $\delta$, for the given state/action
pair. The learning rate for WoLFAntDA is given by:

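$$
\delta =
\begin{cases}
\delta_w, & \text{if } \sum_{dq' \in DQ_s} \pi_{(s,dq')} \cdot \tau_{(s,dq')} > \sum_{dq' \in DQ_s} \bar{\pi}_{(s,dq')} \cdot \tau_{(s,dq')} \\
\delta_l, & \text{otherwise}
\end{cases}
\tag{22}
$$

while the learning rate for PD-WoLFAntDA is given by:

$$
\delta =
\begin{cases}
\delta_w, & \text{if } \Delta\pi_{(s,dq')}(t) \cdot \Delta^2\pi_{(s,dq')}(t) < 0 \\
\delta_l, & \text{otherwise}
\end{cases}
\tag{23}
$$

where

$$
\Delta\pi_{(s,dq')}(t) = \pi_{(s,dq')}(t) - \pi_{(s,dq')}(t-1)
\tag{24}
$$

and

$$
\Delta^2\pi_{(s,dq')}(t) = \Delta\pi_{(s,dq')}(t) - \Delta\pi_{(s,dq')}(t-1)
\tag{25}
$$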
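The two win/lose tests can be sketched in Python as shown below. This is an illustrative reading, not the thesis code: the function names and example values are hypothetical, and the use of an average policy in the WoLF-style test follows the standard WoLF-PHC formulation, which is an assumption here since section 2.5.2 is not reproduced. The step rates 0.005 and 0.030 are the values given in section 3.2.2.

```python
DELTA_W = 0.005  # step rate when winning
DELTA_L = 0.030  # step rate when losing


def wolf_rate(pi, pi_avg, tau):
    """WoLF-style test over the outgoing edges of one server.

    pi, pi_avg, tau -- dicts keyed by (d, q) vertex; pi_avg is the average
    policy (assumed, following standard WoLF-PHC).
    """
    current = sum(pi[a] * tau[a] for a in pi)
    average = sum(pi_avg[a] * tau[a] for a in pi)
    return DELTA_W if current > average else DELTA_L


def pdwolf_rate(pi_t, pi_t1, pi_t2):
    """PDWoLF-style test for one edge from three consecutive policy values."""
    d_now = pi_t - pi_t1      # first difference at time t
    d_prev = pi_t1 - pi_t2    # first difference at time t - 1
    d2 = d_now - d_prev       # second difference at time t
    return DELTA_W if d_now * d2 < 0 else DELTA_L


rate_a = wolf_rate({"a": 0.7, "b": 0.3}, {"a": 0.5, "b": 0.5}, {"a": 2.0, "b": 1.0})
rate_b = pdwolf_rate(0.55, 0.50, 0.40)  # policy still rising, but slowing down
rate_c = pdwolf_rate(0.60, 0.50, 0.45)  # policy rising and accelerating
```

The PDWoLF-style test needs only the last three policy values of the edge itself, whereas the WoLF-style test needs an average policy over all outgoing edges, which matches the lower memory overhead noted for PDWoLF-PHC in section 3.2.2.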

Once  an  edge’s  learning  rate  is  discovered,  a  policy  change  value  must  be  calculated, 
which  is  given  by  equations  26  and  27. 
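$$
\delta_{(s,dq)} = \min\left( \pi_{(s,dq)}, \frac{\delta}{|DQ| - 1} \right)
\tag{26}
$$

$$
\Delta_{(s,dq)} =
\begin{cases}
-\delta_{(s,dq)}, & \text{if } dq \neq \arg\max_{l \in DQ} \tau_{(s,l)} \\
\sum_{l \neq dq} \delta_{(s,l)}, & \text{otherwise}
\end{cases}
\tag{27}
$$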


Once  the  change  in  policy  is  determined,  it  is  used  to  calculate  the  new  policy  value 
for  the  edge  by  adding  the  change  in  policy  to  the  current  policy  value. 


$$
\pi_{(s,dq)}(t+1) \leftarrow \pi_{(s,dq)}(t) + \Delta_{(s,dq)}
\tag{28}
$$

The ants with the p best solutions were allowed to update policy on the edges. Each
of these ants computes $\Delta_{(s,dq)}$, and the sum of these values is then used in equation 28. p is
a tunable parameter subject to experimentation. The parameter values that worked best for
WoLFAntDA and PD-WoLFAntDA are explained in section 4.3.
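A single policy hill-climbing step over the outgoing edges of one server can be sketched in Python as follows. The sketch follows the standard WoLF-PHC form, in which the edge with the most pheromone gains exactly what the other edges give up; reading the update this way is an assumption, as are the function name and the example values. It shows one ant's contribution only, not the sum over the p best ants, and it assumes at least two outgoing edges.

```python
def policy_step(pi, tau, delta):
    """One hill-climbing step for the edges leaving a single server.

    pi    -- dict (d, q) vertex -> current policy value
    tau   -- dict (d, q) vertex -> pheromone on the edge
    delta -- learning rate chosen by the win/lose test
    """
    n = len(pi)
    best = max(tau, key=tau.get)  # edge with the highest pheromone
    # Per-edge step, capped so that no policy value drops below zero.
    step = {a: min(pi[a], delta / (n - 1)) for a in pi}
    change = {}
    for a in pi:
        if a == best:
            change[a] = sum(step[b] for b in pi if b != a)
        else:
            change[a] = -step[a]
    # New policy value is the current value plus its change.
    return {a: pi[a] + change[a] for a in pi}


pi = {"a": 0.5, "b": 0.3, "c": 0.2}
tau = {"a": 0.1, "b": 0.4, "c": 0.2}
new_pi = policy_step(pi, tau, delta=0.03)
```

Because the gain of the best edge equals the total loss of the others, the policy values leaving a server keep the same sum after the step.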

3.2.7.2 (d, q) pairs to Servers. Once policy has been updated on the
server to (d, q) pair edges, policy is then updated on the (d, q) pair to server edges. This is
accomplished in the same manner as in section 3.2.7.1 except that instead of using the edges
going from servers to (d, q) pairs ($\pi_{(s,dq)}$), the edges traversing from (d, q) pairs to servers
($\pi_{(dq,s)}$) are used.

3.3  The  Server-Filling  Replica  Creation  Heuristic 

In the AntDA, WoLFAntDA, and PD-WoLFAntDA algorithms described previously
in this chapter, ants simply create replicas or adjust the update loads of existing replicas
(when the server is already hosting a lower quality replica of application d) as they make
assignments. However, in cases where a replica of DA d does not use all the capacity on
its host server s, it may be possible to assign request load of other qualities of d to s. If
additional request load can be assigned to s, then the update burden already incurred on s is
better utilized and, in turn, system-wide update burden, UB, can be kept low.

AntDA, WoLFAntDA, and PD-WoLFAntDA do this using the Server Filling heuristic
(SF), which proceeds in the following two steps.

1. SF first tries to avoid the creation of extra replicas of d by finding other qualities of d
that completely fit on s. More specifically, SF looks for another quality $r \in Q_d$ such
that all of $rem(rl_{d,r})$ can be assigned to s. Note that update load differences have
to be accounted for since it may be that r > q and hence $ul_{d,r} > ul_{d,q}$. SF assigns
the highest r found, repeating with additional qualities of d if possible. Let y be the
highest quality of d assigned to s at the end of this step.

2. If s still has spare capacity after step 1, SF looks for the highest quality u of d such
that u < y and assigns as much request load of quality u as possible to the replica.

The  SF  heuristic  is  an  optional,  but  beneficial,  part  of  AntDA,  WoLFAntDA,  and 
PD-WoLFAntDA;  in  experiments  SF  reduced  update  burden  by  over  4%  on  average  (see 
Section  4.5). 
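The two SF steps can be sketched in Python as follows. This is a simplified illustration under stated assumptions, not the thesis implementation: the function names and the data layout are hypothetical, qualities are integers where a higher number means higher quality, y is taken to be the replica's quality after step 1, and raising the replica to quality r is assumed to cost the difference in update load.

```python
def upgrade_cost(ul, rep_q, r):
    """Extra update load if the replica must be raised from rep_q to quality r."""
    return max(0.0, ul[r] - ul[rep_q])


def server_fill(rem_cap, rep_q, ul, rem_rl):
    """Fill server s with request load of other qualities of DA d.

    rem_cap -- remaining capacity of s after the replica was created
    rep_q   -- quality level of the replica of d currently on s
    ul      -- dict quality -> update load of d at that quality
    rem_rl  -- dict quality -> request load of d still unassigned
    Returns (rem_cap, rep_q, assigned, rem_rl) without mutating the inputs.
    """
    rem_rl = dict(rem_rl)
    assigned = {}
    # Step 1: repeatedly take the highest quality whose remaining load fits entirely.
    while True:
        fits = [r for r, load in rem_rl.items()
                if load > 0 and load + upgrade_cost(ul, rep_q, r) <= rem_cap]
        if not fits:
            break
        r = max(fits)
        rem_cap -= rem_rl[r] + upgrade_cost(ul, rep_q, r)
        assigned[r] = rem_rl[r]
        rem_rl[r] = 0.0
        rep_q = max(rep_q, r)
    # Step 2: give leftover capacity to the highest lower quality with load remaining.
    lower = [u for u, load in rem_rl.items() if u < rep_q and load > 0]
    if rem_cap > 0 and lower:
        u = max(lower)
        part = min(rem_cap, rem_rl[u])
        assigned[u] = assigned.get(u, 0.0) + part
        rem_rl[u] -= part
        rem_cap -= part
    return rem_cap, rep_q, assigned, rem_rl


# Hypothetical DA with three qualities; a quality-2 replica was just created on s.
ul = {1: 2.0, 2: 5.0, 3: 9.0}
cap, rep_q, assigned, left = server_fill(40.0, 2, ul, {1: 60.0, 2: 0.0, 3: 30.0})
```

In this hypothetical case, step 1 moves all of the quality-3 load to s and upgrades the replica, and step 2 spends the remaining capacity on part of the quality-1 load.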

3.4  Other  Solution  Methods  Used  for  Comparison 

To show the worth of AntDA, WoLFAntDA, and PD-WoLFAntDA, they must be
shown to perform better than other solution methods that have historically been used for
assignment optimization problems. In the next chapter, these three algorithms' results
are compared against three algorithms adapted to fit the Quality-Sensitive DA Replication
Problem: a random assignment algorithm, Random; a greedy algorithm, Greedy; and the
LINGO Integer Linear Programming (ILP) solver [50].

Random  picks  a  (d,  q)  pair  with  non-zero  remaining  request  load  at  random  and 
assigns  it  to  a  random  server  capable  of  hosting  it.  All  selections  are  made  using  a  uniform 
distribution.  Random  reports  the  best  solution  found  out  of  1000  trials. 
The Greedy algorithm [53] makes assignments by choosing the (d, q) pair with the
highest predicted update burden. It does this by the following algorithm:

Algorithm 3 The Greedy Algorithm for DA rep
1: Sort the set of capacitated servers by capacity and store the results in a data structure S.
2: Set $rem(rl_{d,q}) = rl_{d,q}$ for each (d, q) pair.
3: Let $C_{max}$ denote the capacity of the server, $s_{max}$, with the most remaining capacity in
S. Choose the (d, q) pair to be assigned to $s_{max}$.
4: if creating a replica of the chosen (d, q) pair on $s_{max}$ means that $s_{max}$ has no room left
over for handling requests ($C_{max} \le ul_{d,q}$) then
5:     remove $s_{max}$ from S and go to step 19.
6: end if
7: For the (d, q) pair selected in step 3, decide the replica's quality repQ ($repQ \ge q$).
8: Then, decide the amount of request load for any additional qualities of d, $r \in Q_d$, to
be carried by the replica using the Server Filling replica creation policy (Section 3.3).
9: Record repQ and the amount of request load of each $r \in Q_d$ assigned to $s_{max}$.
10: Decrement $rem(rl_{d,r})$ for each $r \in Q_d$ by the amount assigned to $s_{max}$.
11: Decrement $C_{max}$ by the replica's update load and the sum of the request loads assigned.
12: if $rem(rl_{d,q}) = 0$ for all (d, q) pairs then
13:     STOP with a complete solution.
14: end if
15: if $C_{max} = 0$ then
16:     remove $s_{max}$ from S.
17: end if
18: Re-sort S if needed.
19: if $S = \emptyset$ then
20:     STOP with a partial solution.
21: else
22:     go to step 3.
23: end if


Greedy  only  needed  to  be  run  once  for  its  best  solution  to  be  found. 

LINGO ILP Solver solves DA rep using the ILP formulation in section 2.2.4. Although
ILP solvers such as LINGO are the only known method besides complete enumeration
that can find guaranteed optimal solutions, execution times can be prohibitive.
Therefore, LINGO was only used on the small test cases and allotted four hours to work
on the DA rep problems. This was sufficient time for LINGO to produce feasible, but not
optimal, solutions and provides a notion of DA rep's complexity.
All three of the assignment algorithms presented above are static. For more information
on some dynamic assignment algorithms adapted for DA rep, as well as more detailed
explanations of these static algorithms, see [53].

Chapter IV presents the results of experiments that compare AntDA with these solution
methods and reveal the importance of the Server-Filling heuristic and of
pheromone and heuristics to ants traversing the bipartite graph. It also demonstrates the
effects of combining the Win or Learn Fast algorithm with AntDA, compared with AntDA
alone.
IV. Results and Analysis

This chapter evaluates the performance of the proposed AntDA, WoLFAntDA, and
PD-WoLFAntDA algorithms and is divided into six main sections. Section 4.1 discusses the
configuration of the experiments used for analysis. The second and third sections describe
how parameter values are chosen and which values are used for both AntDA and
WoLFAntDA. Section 4.4 demonstrates the performance of AntDA as compared with other
solution methods. Sections 4.5 and 4.6 show the effects that the Server Filling optimization
heuristic and limiting the number of ants allowed to deposit pheromone have on the three
proposed algorithms. The chapter concludes with a performance analysis of WoLFAntDA
and PD-WoLFAntDA versus AntDA.

4.1  Explanation  of  Test  Cases 

Each  experiment  involves  a  hypothetical  ASP  with  a  variety  of  server  capacities  and 
customer  DAs.  The  DAs  are  designed  to  subject  the  algorithms  to  extremes  that  might  be 
found  in  a  real  world  environment.  The  test  cases  were  derived  from  [53]. 

In  each  experiment,  DAs  have  the  same  number  of  service  quality  levels  (either  1,  2, 
or  3)  and  have  a  particular  update  load  (UL)  pattern  and  request  load  (RL)  pattern.  Table  2 
describes  the  parameters  used  in  constructing  ASPs  for  the  experiments. 

The  UL  pattern  determines  the  update  load  values  that  the  freshness  quality  levels  of 
the  DA  can  assume.  There  are  two  patterns:  low  and  high.  Update  loads  for  DAs  always 
increase  with  quality  level.  The  low  UL  pattern  ensures  that  all  qualities  of  all  the  DAs  can 
fit  on  any  of  the  ASP’s  servers.  However,  update  loads  in  the  high  UL  pattern  are  boosted 
so  that  some  servers  will  not  be  able  to  host  high-quality  replicas  of  some  DAs. 

The  RL  pattern  determines  how  user  request  loads  change  with  quality  level  for  the 
DAs an ASP is hosting. There are two patterns: increasing and decreasing. For the decreasing
RL pattern, request loads are large for low quality levels and decrease as quality levels
rise.  For  the  increasing  RL  pattern,  request  loads  start  small  and  increase  with  quality  level. 
| Parameter | Parameter Option | Option Description |
|---|---|---|
| Number of Qualities Per DA | 1 | All the DAs hosted by the ASP have 1 service quality level. |
| | 2 | All the DAs in an ASP have 2 service quality levels. |
| | 3 | All the DAs in an ASP have 3 service quality levels. |
| Update Load (UL) Pattern | low | Any server can host any DA. |
| | high | The maximum update load can be above the capacity limit of the smallest servers. |
| Request Load (RL) Pattern | decreasing | Request load decreases as a DA's quality levels increase. Since higher service qualities require more maintenance and probably account for declining percentages of demand, this pattern is most likely predominant in the real world. |
| | increasing | The opposite of decreasing: the higher the service quality, the higher the request load. |

Table  2:  Parameters  Used  in  Constructing  ASPs  for  the  Static  Experiments. 


| # DAs Per ASP | # Quals Per DA | UL Pattern | UL Range, Quality 1 | UL Range, Quality 2 | UL Range, Quality 3 | RL Pattern | RL Range, Quality 1 | RL Range, Quality 2 | RL Range, Quality 3 |
|---|---|---|---|---|---|---|---|---|---|
| 5 | 1 | low | 1-5 | - | - | decr | 400-600 | - | - |
| 5 | 1 | high | 5-35 | - | - | decr | 400-600 | - | - |
| 5 | 2 | low | 1-7 | 9-15 | - | decr | 300-400 | 100-200 | - |
| 5 | 2 | high | 5-18 | 22-35 | - | incr | 100-200 | 300-400 | - |
| 5 | 3 | low | 1-4 | 6-9 | 11-15 | decr | 233-300 | 133-200 | 34-100 |
| 5 | 3 | high | 5-14 | 16-25 | 27-36 | incr | 34-100 | 133-200 | 233-300 |
| 10 | 3 | low | 1-4 | 6-9 | 11-15 | decr | 233-300 | 133-200 | 34-100 |
| 10 | 3 | high | 5-14 | 16-25 | 27-36 | incr | 34-100 | 133-200 | 233-300 |
| 20 | 3 | low | 1-4 | 6-9 | 11-15 | decr | 233-300 | 133-200 | 34-100 |

Table  3:  How  ASPs  Were  Constructed  for  the  Static  Experiments.  This  table  shows  how 
the  parameters  of  Table  2  were  combined  to  form  ASPs  for  the  static  experiments.  Value 
ranges  for  update  loads  and  request  loads  are  listed. 


Each experiment is run on an Intel Xeon CPU (3.20 GHz) with 3.75 GB
of RAM. AntDA, WoLFAntDA, and PD-WoLFAntDA were all written in Java and run in
the NetBeans 4.0 development environment.

Table 3 shows how these parameters were combined to produce the test cases for
the static experiments. For each combination of parameters (UL pattern, RL pattern, and
# of qualities/DA) five, ten, or twenty ASPs were randomly created. For example, the row
for the five 2-quality/low/decreasing ASPs (third row of data in Table 3) shows that DA update
loads in this type of ASP range from 1-7 for quality 1 and 9-15 for quality 2, while
request loads range from 300-400 for quality 1 and 100-200 for quality 2. Five test cases
were also generated for a 20 DA, 3 quality, increasing RL pattern, high UL pattern configuration,
but the hardware on which the tests were conducted was insufficient to compute these cases
due to their high complexity.

Once an ASP's DAs were assigned, the ASP's servers were determined by growing a
candidate set of servers. The candidate set is initially empty, and servers are added to it in
groups of five. The servers in each group have the following load capacities: 25, 50, 75, 100, 125.
This distribution of server capacities is intended to model an ASP with a variety of servers.
Groups were added to the candidate set until the Greedy algorithm, using the Server Filling
policy (see section 4.5), produced a feasible assignment. These test cases were then solved
by Random, Greedy, and the LINGO ILP Solver as well as AntDA and WoLFAntDA. The
results are shown and discussed in sections 4.4 and 4.7.
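The server-generation procedure can be sketched in Python as follows. The group capacities are those stated above; everything else is a hypothetical stand-in. In particular, the thesis uses the Greedy algorithm with Server Filling as the feasibility test, whereas this sketch substitutes a simple total-capacity check, and the `max_groups` guard is an added safety limit.

```python
SERVER_GROUP = [25, 50, 75, 100, 125]  # capacities added per group


def grow_candidate_servers(is_feasible, max_groups=1000):
    """Add groups of five servers until is_feasible(servers) returns True."""
    servers = []
    for _ in range(max_groups):
        servers.extend(SERVER_GROUP)
        if is_feasible(servers):
            return servers
    raise RuntimeError("no feasible server set found within max_groups")


def enough_total_capacity(servers, demand=1000):
    """Stand-in feasibility test: total capacity covers a hypothetical demand."""
    return sum(servers) >= demand


servers = grow_candidate_servers(enough_total_capacity)
```

Each group contributes 375 units of capacity, so in this hypothetical case three groups (fifteen servers) are needed to cover a demand of 1000.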

4.2  Parameter  Selection  for  AntDA 

The main task after implementing AntDA is to find the parameter values that work
best for solving the Quality-Sensitive DA Replication Problem. The parameter
values are determined through trial and error: many different values are tested for each
parameter, each subjected to 50 trials of 400 time steps. Since there are many parameters,
trying every possible combination of them would be infeasible. Therefore, to determine
parameter values, one parameter is chosen as the variable to be tested and all others are
held constant. Table 4 shows each parameter and the values tested for each. The
numbers in bold are the values for parameters while they are being held constant. Each
test case was run using the same problem instance to maintain consistency. The problem
instance is one of the five constructed from line ten of Table 3 (# nodes = 70) because it
was the hardest problem available that could be solved in a manageable amount of time
by AntDA and LINGO (though LINGO only produced feasible but not optimal solutions).
After going through this process, the best values for all parameters were used in combination
to verify that performance did not degrade (it did not). The best values
Parameter        Parameter Values Tested
α                0, 1, 3, 5, 7, 8, 10
β                0, 1, 3, 5, 7, 8, 10
ρ                0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9
ω                1, 2, 3, 4, 5, 6, 7, 8, 9
γ                0.01, 0.1, 1, 10, 100
Number of Ants   35, 42, 49, 56, 63, 70, 77, 84, 91, 98, 105
τ0               0.0000001, 0.000001, 0.00001, 0.001, 0.1, 1.1, 5, 10, 100, 500, 1000
m                1, 3, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70

Table 4: Parameter and condition values tested for AntDA experiments.


Parameter        Parameter Value   Parameter Description
α                1                 Pheromone weighting.
β                8                 Heuristic weighting.
ρ                0.8               New to old pheromone ratio.
ω                4                 Non-feasible solution penalty constant.
γ                1                 Pheromone change constant.
Number of Ants   |DQ| + |S|        The number of ants.
τ0               0.1               Initial edge pheromone.
m                ⌊# Ants · 0.1⌋    The top m ants are allowed to deposit pheromone.

Table 5: Parameter and condition values for AntDA experiments.


identified by this selection process are shown in Table 5 and are used throughout the AntDA
experiments presented hereafter unless otherwise stated.
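
The one-parameter-at-a-time procedure described above can be sketched as follows. This is
a minimal illustration under stated assumptions, not the thesis implementation: the
function name one_at_a_time_sweep, the evaluate callback, the baseline values, and the toy
cost function are all hypothetical. In the actual experiments, evaluate would correspond
to running AntDA for 50 trials of 400 time steps and returning the average update burden.

```python
from typing import Callable, Dict, List


def one_at_a_time_sweep(
    baseline: Dict[str, float],
    candidates: Dict[str, List[float]],
    evaluate: Callable[[Dict[str, float]], float],
) -> Dict[str, float]:
    """Tune each parameter separately while all others stay at their baseline."""
    best: Dict[str, float] = {}
    for name, values in candidates.items():
        scores = {}
        for value in values:
            trial = dict(baseline)   # copy so the baseline is never mutated
            trial[name] = value      # vary exactly one parameter
            scores[value] = evaluate(trial)
        # Lower cost (update burden) is better.
        best[name] = min(scores, key=scores.get)
    return best


def toy_cost(params: Dict[str, float]) -> float:
    # Hypothetical stand-in for "average update burden over 50 AntDA trials".
    return (params["alpha"] - 1) ** 2 + (params["beta"] - 8) ** 2


baseline = {"alpha": 3, "beta": 3}  # assumed held-constant values
candidates = {
    "alpha": [0, 1, 3, 5, 7, 8, 10],
    "beta": [0, 1, 3, 5, 7, 8, 10],
}
best_values = one_at_a_time_sweep(baseline, candidates, toy_cost)
print(best_values)
```

As in the text, the per-parameter winners would then be run together once to confirm that
the combination does not degrade performance.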

For the most part, AntDA was fairly insensitive to changes in the values shown in
Table 5. However, the one parameter that had a major impact was the number, m, of ants
that deposit pheromone at the end of each time step (Section 3.1.4). For AntDA, setting m
to the top 10% of the number of ants reduced the number of time steps needed to converge
by a factor of as much as 4.5 compared to allowing all ants to deposit pheromone, while
also reducing update burdens. The impact of setting m to the top 10% of the number of
ants is presented in Section 4.6.
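
The top-m restriction can be illustrated with a short sketch. Only the rule
m = ⌊# Ants · 0.1⌋ comes from Table 5; the data layout, the function name deposit_top_m,
the clamp to at least one ant, and the deposit amount gamma / cost are assumptions made
for illustration (the actual AntDA deposit rule is defined in Section 3.1.4).

```python
from typing import Dict, List, Tuple

Edge = Tuple[str, str]                 # hypothetical edge label: ((d, q) pair, server)
Solution = Tuple[float, List[Edge]]    # (update burden, edges used by the ant)


def deposit_top_m(
    solutions: List[Solution],
    pheromone: Dict[Edge, float],
    top_percent: int = 10,
    gamma: float = 1.0,
) -> int:
    """Let only the m lowest-cost ants deposit pheromone; return m."""
    # m = floor(# ants * top_percent / 100), computed with integer arithmetic.
    # Clamping to at least 1 is an assumption so that some ant always deposits.
    m = max(1, len(solutions) * top_percent // 100)
    ranked = sorted(solutions, key=lambda s: s[0])  # lower update burden first
    for cost, edges in ranked[:m]:
        for edge in edges:
            # Assumed deposit amount: gamma / cost (not taken from the thesis).
            pheromone[edge] = pheromone.get(edge, 0.0) + gamma / cost
    return m


# Ten hypothetical ants with costs 100..109, each using one distinct edge.
ants = [(100.0 + i, [("dq0", f"s{i}")]) for i in range(10)]
trail: Dict[Edge, float] = {}
m = deposit_top_m(ants, trail)
```

With ten ants and a 10% cutoff, only the single best ant reinforces its edge; with 70
ants, the same rule selects the best 7.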


4.3  Parameter  Selection  for  WoLFAntDA  and  PD-WoLFAntDA 

After implementing WoLFAntDA and PD-WoLFAntDA, their parameters needed to
be tuned for solving the Quality-Sensitive DA Replication Problem. The parameter values
Parameter   Parameter Values Tested
α           0, 1, 3, 5, 7, 8, 10
β           0, 1, 3, 5, 7, 8, 10
m           1, 3, 7, 14, 21, 28, 35, 42, 49, 56, 63, 70
p           1, 3, 7, 14, 21, 28, 35
δ_l         0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035
δ_w         0.005, 0.01, 0.015, 0.02, 0.025, 0.03, 0.035

Table 6: Parameter and condition values tested for the WoLFAntDA and PD-WoLFAntDA
experiments.


Parameter        Parameter Value   Parameter Description
α                1                 Pheromone weighting.
β                8                 Heuristic weighting.
ρ                0.8               New to old pheromone ratio.
ω                4                 Non-feasible solution penalty constant.
γ                1                 Pheromone change constant.
Number of Ants   |DQ| + |S|        The number of ants.
τ0               0.1               The amount of pheromone initially on each edge in the graph.
m                ⌊# Ants · 0.2⌋    The top m ants are allowed to deposit pheromone.
p                ⌊# Ants · 0.1⌋    The top p ants are allowed to update policy.
δ_l              0.03              Losing step size.
δ_w              0.005             Winning step size.

Table 7: Parameter and condition values for the WoLFAntDA and PD-WoLFAntDA
experiments.


were determined in the same manner as for AntDA, with the exception that there are a few
new parameters: δ_l, δ_w, and p. The main focus was on tuning these new parameters, but
tests were also run for α, β, and m to determine their best values for WoLFAntDA and
PD-WoLFAntDA. Unless otherwise stated, these two algorithms are run
with the parameter values and conditions shown in Table 7.

WoLFAntDA and PD-WoLFAntDA are also fairly insensitive to changes in the values
shown in Table 7. However, just as in AntDA, the parameter that seemed to have the
biggest impact was the number of ants allowed to change edge pheromone values. The
impact of this parameter, m, is presented in Section 4.6.
4.4  Comparison  of  AntDA  To  Other  Optimization  Algorithms

Tables 8 and 9 show the performance of AntDA and the other solution methods for
forty-five test cases. Each row of the tables represents one test case, and the columns
are grouped by solution method. The Random column shows the lowest-cost solution produced
over 1000 executions of the Random algorithm. For the ant-based results, the minimum,
maximum, average, and standard deviation of the fifty solutions for each test case are shown.
Recall that AntDA is run 50 times for each test case and that each run lasts 400 time
steps. The lowest-cost solutions for each test case are shown in bold typeface.
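
As a small illustration of how the ant-based columns are derived, the following sketch
reduces the final costs of repeated runs to the four reported statistics. The cost list
is hypothetical, and the use of the sample standard deviation (statistics.stdev) is an
assumption; the text does not say whether the sample or population form was used.

```python
from statistics import mean, stdev
from typing import Dict, List


def summarize_runs(costs: List[float]) -> Dict[str, float]:
    """Reduce the final update burdens of repeated runs to min/max/avg/stdev."""
    return {
        "min": min(costs),
        "max": max(costs),
        "avg": mean(costs),
        "stdev": stdev(costs),  # sample standard deviation (assumed)
    }


# Hypothetical final update burdens from five runs (the thesis uses fifty).
example_costs = [239, 249, 245, 247, 246]
summary = summarize_runs(example_costs)
print(summary)
```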

AntDA  found  the  solution  with  the  lowest  update  burden  in  all  but  three  test  cases. 
Also,  in  all  but  two  cases,  the  solution  with  the  maximum  update  burden  found  by  AntDA  is 
better  than  the  minimum  update  burden  found  by  the  Random  and  Greedy  solution  methods. 

Clearly, AntDA produces better solutions than the three other methods. However,
AntDA has higher solution times than the Greedy and Random methods. For example, in the
5 DA, 3 quality, increasing RL pattern, high UL pattern experiments, the Greedy algorithm
can produce a solution in milliseconds, the Random algorithm needed about 1.5 minutes, and
LINGO was cut off after two weeks. On the same hardware, AntDA requires an average
of 7.2 minutes to complete the 400 time steps and produce a single solution. Since AntDA
was run 50 times, its total run-time was close to 6 hours. For more complex problems, AntDA
took as long as a week to complete 50 runs. However, 400 time steps and 50 runs
are more thorough than necessary. Reducing the number of time steps would allow for much
faster results. The number of time steps (the convergence rate) necessary for AntDA to find
its best solution is presented in Section 4.7.

4.5  Effect  of  the  Server  Filling  Algorithm 

This section highlights the impact of the Server-Filling (SF) optimization heuristic.
Table 10 shows the minimum update burdens produced by AntDA, WoLFAntDA, and
PD-WoLFAntDA with and without the Server-Filling heuristic.
                                     Solution Cost (Update Burden)
                                                             AntDA
# DA  Quals/DA  RL Patt  UL Patt  #  Random  Greedy  LINGO   min   max   avg     stdev  # Ants
5     1         n/a      low      1  305     258     246+    239   249   245.68  1.67   45
5     1         n/a      low      2  254     225     203+    203   207   203.72  0.97   45
5     1         n/a      low      3  298     243     231+    231   231   231     0.0    50
5     1         n/a      low      4  379     323     313+    305   314   308.22  2.40   50
5     1         n/a      low      5  351     307     292+    284   292   287.06  2.27   45
5     1         n/a      high     1  930     860     821+    796   821   799.16  4.70   60
5     1         n/a      high     2  698     659     656+    645   647   645.24  0.66   55
5     1         n/a      high     3  895     894     856+    856   945   916     24.16  55
5     1         n/a      high     4  998     983     964+    953   1009  976.1   19.48  55
5     1         n/a      high     5  810     761     710+    708   727   713.06  4.25   55
5     2         decr     low      1  259     211     206*    196   201   197     1.81   50
5     2         decr     low      2  188     164     166*    160   160   160     0.0    50
5     2         decr     low      3  271     226     230*    217   220   217.78  1.25   55
5     2         decr     low      4  169     157     156*    155   155   155     0.0    50
5     2         decr     low      5  230     194     193*    187   188   187.02  0.14   50
5     2         incr     high     1  1070    890     829*    838   850   849.12  2.39   65
5     2         incr     high     2  990     909     831*    838   850   844.8   3.11   60
5     2         incr     high     3  999     819     781*    786   813   796.74  9.17   65
5     2         incr     high     4  1167    957     1002*   858   860   858.08  0.40   65
5     2         incr     high     5  974     809     832*    720   728   722.58  2.89   65
5     3         decr     low      1  242     206     237++   178   179   178.02  0.14   55
5     3         decr     low      2  220     186     215++   158   159   158.04  0.20   55
5     3         decr     low      3  186     155     166++   154   155   154.07  0.27   55
5     3         decr     low      4  177     151     158++   142   142   142.00  0.00   55
5     3         decr     low      5  196     171     176++   145   147   146.40  0.57   55
5     3         incr     high     1  1057    842     961**   784   800   793.86  5.77   70
5     3         incr     high     2  1135    884     940**   811   824   817.94  2.94   70
5     3         incr     high     3  1048    788     907**   764   771   766.06  2.78   70
5     3         incr     high     4  1099    849     885**   813   822   818.98  2.02   75
5     3         incr     high     5  1137    867     913**   811   823   814.64  2.60   70

Table 8: Comparison of AntDA to other search algorithms for 5 DA problems. Depending
on the problem, LINGO was allowed to run for differing amounts of time: (+) = 1 hour, (*) =
2 hours, (++) = 4 hours, and (**) = 300 hours. The lowest-cost solutions for each test case
are shown in bold typeface.


Server-Filling  experimental  results  are  shown  for  just  five  test  cases  (hypothetical 
DA  rep  instances)  involving  five  DAs  of  three  quality  levels  each  with  an  increasing  request 
load  pattern  and  a  high  update  load  pattern.  These  test  cases  were  chosen  because  they 
proved  to  be  the  most  difficult  to  solve  in  a  reasonable  amount  of  time. 

Using SF reduced the update burden by an average of 39.0 points across all three
algorithms, which translates to a 4.64% reduction on average for these 5 test cases. In all
                                     Solution Cost (Update Burden)
                                                     AntDA
# DA  Quals/DA  RL Patt  UL Patt  #  Random  Greedy  min   max   avg      stdev  # Ants
10    3         decr     low      1  478     336     328   335   330.5    1.52   105
10    3         decr     low      2  532     360     340   344   342.38   1.28   110
10    3         decr     low      3  489     332     325   326   325.24   0.43   110
10    3         decr     low      4  492     324     318   321   319.58   0.76   110
10    3         decr     low      5  476     320     309   311   310.9    0.36   110
10    3         incr     high     1  2401    1703    1635  1644  1639.71  2.90   135
10    3         incr     high     2  2547    1817    1719  1749  1734.93  7.87   140
10    3         incr     high     3  2335    1676    1565  1600  1584.73  10.30  140
10    3         incr     high     4  2669    1899    1796  1826  1812.90  6.70   140
10    3         incr     high     5  2531    1791    1711  1734  1725.31  5.41   135
20    3         decr     low      1  1129    717     690   695   692.76   1.19   215
20    3         decr     low      2  1074    705     677   687   681.96   2.16   210
20    3         decr     low      3  1081    695     676   685   681.66   2.03   210
20    3         decr     low      4  1069    699     667   676   670.82   2.98   215
20    3         decr     low      5  1137    756     725   734   729.88   2.24   215

Table 9: Comparison of AntDA to other search algorithms for 10 and 20 DA problems.
The lowest-cost solutions for each test case are shown in bold typeface.


                     Solution Cost (Update Burden)
     AntDA                 WoLFAntDA             PD-WoLFAntDA
     Server Filling  %     Server Filling  %     Server Filling  %
#    Off    On     Diff    Off    On     Diff    Off    On     Diff
1    823    784    4.74    832    797    4.21    827    793    4.11
2    841    811    3.57    862    819    4.99    854    814    4.68
3    793    764    3.66    806    764    5.21    793    764    3.66
4    852    813    4.58    851    818    3.88    864    816    5.56
5    852    811    4.81    870    814    6.44    861    814    5.46
Average            4.272                 4.946                 4.694

Table 10: The Effect of the server filling algorithm on AntDA and WoLFAntDA.


cases,  SF  improves  solution  values.  This  demonstrates  the  importance  of  local  heuristics 
(non-ACO)  for  improving  performance  of  ACO  algorithms  (as  was  also  seen  in  other  ACO 
work  [34,38]). 
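
The averages quoted for Server Filling can be reproduced from the Off/On values in
Table 10. The sketch below is only a worked check of that arithmetic; the variable and
function names are illustrative, and the percentage is taken relative to the Off value,
which matches the % Diff column.

```python
# Minimum update burdens from Table 10, test cases 1-5.
off = {
    "AntDA": [823, 841, 793, 852, 852],
    "WoLFAntDA": [832, 862, 806, 851, 870],
    "PD-WoLFAntDA": [827, 854, 793, 864, 861],
}
on = {
    "AntDA": [784, 811, 764, 813, 811],
    "WoLFAntDA": [797, 819, 764, 818, 814],
    "PD-WoLFAntDA": [793, 814, 764, 816, 814],
}


def percent_reduction(before: float, after: float) -> float:
    """Reduction achieved by Server Filling, as a percentage of the Off value."""
    return 100.0 * (before - after) / before


pairs = [(b, a) for name in off for b, a in zip(off[name], on[name])]
points = [b - a for b, a in pairs]
percents = [percent_reduction(b, a) for b, a in pairs]

mean_points = sum(points) / len(points)        # average absolute reduction
mean_percent = sum(percents) / len(percents)   # average relative reduction
print(f"{mean_points:.1f} points, {mean_percent:.2f}%")
```

Over the fifteen algorithm and test case pairs this gives an average reduction of 39.0
points, or about 4.64%.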


4.6  Effect  of  Only  Allowing  the  Top  m  of  Ant  Solutions  to  Deposit  Pheromone 


This  section  discusses  the  impact  of  only  allowing  the  ants  with  the  top  m  solutions 
to  deposit  pheromone  after  each  iteration.  Figures  7  and  8  show  the  average  update  burden 
[Figure 7 plot. Title: Effect of Allowing Top m% of Ant Solutions Deposit. X-axis:
Percentage of Ants Allowed to Deposit Pheromone. Series: AntDA, WoLFAntDA, PD-WoLFAntDA.]

Figure 7: Effect of limiting the number of ants allowed to deposit pheromone on update
burden. Smaller update burdens are better.


and  convergence  rates  produced  by  AntDA  and  both  versions  of  WoLFAntDA  for  many 
different  percentages  of  ants  allowed  to  deposit  pheromone. 

Experimental results are shown for just one test case (Problem #5 of the test cases
with five DAs of three quality levels each with an increasing request load pattern and a high
update load pattern). This test case was chosen because it proved to be the most difficult to
solve in a reasonable amount of time; however, the results are representative of all test cases.

For AntDA, allowing only the ants with the top 10% of solutions to deposit pheromone
after each iteration, versus allowing all ants to deposit pheromone, decreased the average
update burden by 36.1 points and decreased the convergence rate by 141.16
time steps. This translates to a 4.1% decrease in average update burden and a
convergence rate that is lower by a factor of 4.6.

For  WoLFAntDA  and  PD-WoLFAntDA,  allowing  only  ants  with  the  top  20%  of 
solutions  to  deposit  pheromone  after  each  iteration  versus  allowing  all  ants  to  deposit 
[Figure 8 plot. Title: Effect of Allowing Top m% of Ant Solutions Deposit.]

Figure 8: Effect of limiting the number of ants allowed to deposit pheromone on
convergence. Smaller convergence rates are better.


pheromone did not have as large an impact (probably due to the addition of edge policy
values). However, it still decreased the average update burden by over 24 points
and cut the convergence rate by a factor of over 2.5.


4.7  Comparison  of  WoLFAntDA  and  PD-WoLFAntDA  To  AntDA 

Tables 11 and 12 show the performance of AntDA, WoLFAntDA, and PD-WoLFAntDA
for forty-five test cases. Each row of the tables represents one test case. For each test case,
the minimum, average, and standard deviation of the fifty solutions are shown. The
lowest-cost solutions for each test case are shown in bold typeface.

As seen in Table 11, WoLFAntDA and PD-WoLFAntDA find solutions comparable
to AntDA for all of the five DA test cases. However, with the rise of complexity in the
ten and twenty DA test cases (Table 12), WoLFAntDA and PD-WoLFAntDA begin to lose
ground on AntDA. This could be because parameter selection was performed
                                     Solution Cost (Update Burden)
                                     AntDA                  WoLFAntDA              PD-WoLFAntDA
# DA  Quals/DA  RL Patt  UL Patt  #  min   avg      stdev   min   avg      stdev   min   avg      stdev
5     1         n/a      low      1  239   245.68   1.67    241   250      3.06    241   245.68   2.26
5     1         n/a      low      2  203   203.72   0.97    203   206.26   1.74    214   206.14   1.64
5     1         n/a      low      3  231   231      0.0     231   233.02   1.04    231   232.9    2.09
5     1         n/a      low      4  305   308.22   2.40    304   312.12   3.13    306   312.48   21.3
5     1         n/a      low      5  284   287.06   2.27    284   290.78   3.45    284   288.26   2.78
5     1         n/a      high     1  796   799.16   4.70    800   824.92   10.88   796   814.20   13.83
5     1         n/a      high     2  645   645.24   0.66    645   649.06   6.61    645   647.50   5.55
5     1         n/a      high     3  856   916.00   24.16   874   936.66   37.18   856   920.62   29.25
5     1         n/a      high     4  953   976.1    19.48   969   1010.58  13.43   951   992.56   19.71
5     1         n/a      high     5  708   713.06   4.25    717   733.36   7.39    708   727.32   9.68
5     2         decr     low      1  196   197      1.81    200   202.26   2.29    196   202.76   2.47
5     2         decr     low      2  160   160      0.0     160   160      0.0     160   160.04   0.28
5     2         decr     low      3  217   217.78   1.25    219   220.72   0.95    217   220.34   1.12
5     2         decr     low      4  155   155      0.0     155   155.58   0.50    155   155.4    0.49
5     2         decr     low      5  187   187.02   0.14    187   190.06   1.19    187   189.96   1.34
5     2         incr     high     1  838   849.12   2.39    838   855.4    7.94    838   852.82   7.31
5     2         incr     high     2  838   844.80   3.11    855   865.58   6.70    850   862.62   6.32
5     2         incr     high     3  786   796.74   9.17    786   811.48   8.06    786   805.30   8.90
5     2         incr     high     4  858   858.08   0.40    858   867.00   8.06    858   861.58   5.16
5     2         incr     high     5  720   722.58   2.89    721   728.64   4.42    720   724.68   3.69
5     3         decr     low      1  178   178.02   0.14    178   179.16   1.22    178   178.90   1.09
5     3         decr     low      2  158   158.04   0.20    158   160.88   2.02    158   160.32   1.99
5     3         decr     low      3  154   154.07   0.27    155   157.11   1.64    155   157.69   1.58
5     3         decr     low      4  142   142.00   0.00    142   144.34   1.44    142   144.86   1.75
5     3         decr     low      5  145   146.40   0.57    147   148.32   1.41    146   147.84   1.28
5     3         incr     high     1  784   793.86   5.77    797   809.06   5.99    793   806.30   5.99
5     3         incr     high     2  811   817.94   2.94    819   838.04   8.99    814   832.54   8.79
5     3         incr     high     3  764   766.06   2.78    764   776.58   6.44    764   776.54   6.04
5     3         incr     high     4  813   818.98   2.02    818   825.32   2.97    816   824.14   3.05
5     3         incr     high     5  811   814.64   2.60    814   828.04   7.50    814   823.38   8.23

Table 11: Comparison of update burden for WoLFAntDA and PD-WoLFAntDA to
AntDA for 5 DA problems. The lowest-cost solutions for each test case are shown in bold
typeface.


for the five DA problems and not re-calibrated for larger problems. The policy values
could be increasing too rapidly on some edges, and slightly adjusting the δ_l and δ_w values
could cause the ants to explore more. However, even in the test cases with the largest
difference between AntDA and the other two algorithms (the 20 DA test cases), AntDA's
minimum update burden was less than 3% lower than the minimum update burden found by
either WoLFAntDA or PD-WoLFAntDA. In other words, the difference is small. In five
instances, AntDA or WoLFAntDA found their best solutions with a standard deviation of
zero. This improbable behavior occurs when the algorithm finds the same best answer after
400 iterations every time.

A two-tailed t-test was conducted to determine whether the solution values of WoLFAntDA
and PD-WoLFAntDA were statistically different from those of AntDA. Using an alpha level of
                                     Solution Cost (Update Burden)
                                     AntDA                  WoLFAntDA              PD-WoLFAntDA
# DA  Quals/DA  RL Patt  UL Patt  #  min   avg      stdev   min   avg      stdev   min   avg      stdev
10    3         decr     low      1  328   330.50   1.52    332   336.71   1.66    333   336.32   1.62
10    3         decr     low      2  340   342.38   1.28    349   354.74   1.82    350   354.59   2.16
10    3         decr     low      3  325   325.24   0.43    329   330.10   0.67    328   329.93   0.83
10    3         decr     low      4  318   319.58   0.76    321   324.03   1.51    321   323.89   1.74
10    3         decr     low      5  309   310.90   0.36    312   316.89   1.99    313   316.69   1.62
10    3         incr     high     1  1635  1639.71  2.90    1659  1677.03  8.61    1656  1674.86  7.58
10    3         incr     high     2  1719  1734.93  7.87    1758  1784.56  13.18   1735  1779.79  16.85
10    3         incr     high     3  1564  1584.73  10.30   1604  1635.29  11.64   1612  1629.56  10.92
10    3         incr     high     4  1796  1812.90  6.70    1832  1858.06  9.11    1828  1857.05  9.77
10    3         incr     high     5  1711  1725.31  5.41    1747  1772.56  10.29   1725  1768.41  10.76
20    3         decr     low      1  690   692.76   1.19    705   711.62   3.43    706   710.42   2.99
20    3         decr     low      2  677   681.96   2.16    690   700.00   4.37    697   700.31   2.21
20    3         decr     low      3  676   681.66   2.03    687   693.07   2.62    690   693.75   2.38
20    3         decr     low      4  667   670.82   2.98    677   684.10   3.63    679   684.60   2.99
20    3         decr     low      5  725   729.88   2.24    745   747.50   1.96    744   746.60   1.43

Table 12: Comparison of update burden for WoLFAntDA and PD-WoLFAntDA to
AntDA for 10 and 20 DA problems. The lowest-cost solutions for each test case are shown
in bold typeface.


0.05 and 98 degrees of freedom ((50 trials for WoLFAntDA or PD-WoLFAntDA - 1) + (50
trials for AntDA - 1)), it was found that in all cases the probability that there is no difference
between the means is less than 0.05. Therefore, the differences in update burden and
convergence rate between WoLFAntDA or PD-WoLFAntDA and AntDA are statistically
significant.
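
To make the test concrete, the sketch below computes a two-sample t statistic from the
summary statistics of one 20 DA test case in Table 12 (problem #5, AntDA versus
WoLFAntDA). Several points are assumptions rather than details from the text: the
pooled-variance (Student) form of the test is used because it yields the stated 98
degrees of freedom, the reported standard deviations are treated as sample standard
deviations, and the critical value of about 1.984 is the standard tabulated two-tailed
value for alpha = 0.05 near 98 degrees of freedom.

```python
import math
from typing import Tuple


def pooled_t_statistic(
    mean_a: float, sd_a: float, n_a: int,
    mean_b: float, sd_b: float, n_b: int,
) -> Tuple[float, int]:
    """Student's two-sample t statistic from summary statistics (pooled variance)."""
    df = (n_a - 1) + (n_b - 1)
    pooled_var = ((n_a - 1) * sd_a ** 2 + (n_b - 1) * sd_b ** 2) / df
    std_err = math.sqrt(pooled_var * (1.0 / n_a + 1.0 / n_b))
    return (mean_b - mean_a) / std_err, df


# Two-tailed critical value for alpha = 0.05, df = 98 (tabulated, approximate).
T_CRITICAL = 1.984

# Table 12, 20 DA, problem #5: AntDA avg 729.88 (stdev 2.24),
# WoLFAntDA avg 747.50 (stdev 1.96), 50 runs each.
t_stat, dof = pooled_t_statistic(729.88, 2.24, 50, 747.50, 1.96, 50)
significant = abs(t_stat) > T_CRITICAL
print(dof, round(t_stat, 2), significant)
```

For this test case the statistic is far above the critical value, so the difference in
average update burden would be judged significant at the 0.05 level.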

Tables 13 and 14 display the convergence rates for the three algorithms for all 45 test
cases. In the 1 quality test cases, WoLFAntDA and PD-WoLFAntDA converged at
approximately the same rate as AntDA. However, as the problem complexity rose, WoLFAntDA
and PD-WoLFAntDA converged at a much faster rate, leading to a decrease in the average
convergence rate of 99.13% in the 20 DA test cases.

Although WoLFAntDA and PD-WoLFAntDA produce solutions that are worse than
AntDA's, the solutions are still better than those found by the other solution methods,
as seen in Tables 15 and 16. A speedup in convergence of over 99% for less than a 3%
loss in solution quality makes WoLFAntDA and PD-WoLFAntDA much more competitive
with other solution methods. Using the rule of thumb that 95% of values fall within two
standard deviations of their mean, for the 20 DA test cases AntDA would
have to be run for 385 time steps (149 + 2 · 117.82) to have 95% confidence that it is
                                     Convergence Rate (Iterations)
                                     AntDA                  WoLFAntDA              PD-WoLFAntDA
# DA  Quals/DA  RL Patt  UL Patt  #  min   avg      stdev   min   avg      stdev   min   avg      stdev
5     1         n/a      low      1  1     3.40     2.02    1     3.80     3.04    1     3.32     2.80
5     1         n/a      low      2  1     8.06     3.74    1     6.72     10.40   1     9.8      14.21
5     1         n/a      low      3  2     4.7      1.18    1     3.9      2.31    1     3.3      2.71
5     1         n/a      low      4  1     14.02    5.16    1     5.3      7.14    1     3.92     5.65
5     1         n/a      low      5  1     7.96     3.48    1     3.54     2.05    1     5.72     3.55
5     1         n/a      high     1  10    15.40    5.43    2     6.86     4.57    4     9.26     4.78
5     1         n/a      high     2  2     7.02     2.43    1     5.14     2.81    2     5.78     3.97
5     1         n/a      high     3  1     13.46    32.43   1     37.38    62.46   1     37.46    70.14
5     1         n/a      high     4  1     29.92    64.25   1     17.34    46.52   1     41.36    60.70
5     1         n/a      high     5  10    17.24    5.92    2     10.46    7.70    2     14.44    12.91
5     2         decr     low      1  1     8.9      22.34   1     1.06     0.24    1     1.36     1.06
5     2         decr     low      2  1     2.44     0.78    1     1.70     0.46    1     1.84     0.71
5     2         decr     low      3  1     11.40    8.49    1     1.28     0.57    1     1.60     1.05
5     2         decr     low      4  2     3.82     1.93    1     2.80     8.60    1     1.80     1.05
5     2         decr     low      5  3     10.12    4.21    1     2.12     2.25    1     1.66     1.59
5     2         incr     high     1  2     9.66     10.74   1     5.26     2.63    2     6.00     2.04
5     2         incr     high     2  7     110.56   126.70  1     2.1      2.07    1     2.52     2.33
5     2         incr     high     3  1     19.56    47.66   1     4.16     2.54    1     7.08     4.62
5     2         incr     high     4  2     6.42     4.83    1     3.88     2.37    1     4.18     2.32
5     2         incr     high     5  2     16.34    29.69   1     5.34     3.27    2     7.16     2.50
5     3         decr     low      1  2     25.3     72.84   1     1.68     0.89    1     2.04     1.12
5     3         decr     low      2  2     6.62     3.23    1     1.98     1.15    1     2.96     3.73
5     3         decr     low      3  5     10.98    10.63   1     2.24     1.86    1     2.38     1.19
5     3         decr     low      4  2     5.32     4.24    1     2.04     4.95    1     1.4      1.01
5     3         decr     low      5  1     18.14    56.23   1     2.80     9.29    1     2.56     6.73
5     3         incr     high     1  3     13.38    11.52   1     3.40     3.12    1     5.44     4.33
5     3         incr     high     2  4     20.82    11.21   1     5.00     2.99    1     7.66     3.93
5     3         incr     high     3  1     5.36     3.50    1     3.14     2.46    1     2.64     2.36
5     3         incr     high     4  4     34.74    63.95   1     3.40     2.33    1     6.22     10.08
5     3         incr     high     5  7     16.58    27.56   1     5.96     2.86    2     8.70     3.66

Table 13: Comparison of convergence rates for WoLFAntDA and PD-WoLFAntDA to
AntDA for 5 DA problems. The lowest average convergence rate for each test case is
shown in bold typeface.


finding the best solutions possible. With an average of a minute per time step for 20 DA
problems, it would take AntDA almost 6.5 hours to find its best solution. For the same
test cases, WoLFAntDA and PD-WoLFAntDA require runs of a mere 2 time
steps (1.22 + 2 · 0.383). It would therefore take WoLFAntDA or PD-WoLFAntDA only two
minutes to find its best solution.
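
The run-length estimate above is the mean convergence rate plus two standard deviations,
rounded up to whole time steps. The sketch below reproduces that arithmetic with the
20 DA figures quoted in the text; the function name step_budget and the choice to round
up (rather than to the nearest step) are illustrative assumptions.

```python
import math


def step_budget(mean_steps: float, stdev_steps: float, k: float = 2.0) -> int:
    """Time steps covering the mean plus k standard deviations, rounded up."""
    return math.ceil(mean_steps + k * stdev_steps)


MINUTES_PER_STEP = 1.0  # approximate figure quoted for 20 DA problems

antda_steps = step_budget(149, 117.82)    # 149 + 2 * 117.82 = 384.64
wolf_steps = step_budget(1.22, 0.383)     # 1.22 + 2 * 0.383 = 1.986

antda_hours = antda_steps * MINUTES_PER_STEP / 60.0
wolf_minutes = wolf_steps * MINUTES_PER_STEP
print(antda_steps, round(antda_hours, 2), wolf_steps, wolf_minutes)
```

This yields 385 time steps (about 6.4 hours) for AntDA and 2 time steps (about two
minutes) for WoLFAntDA and PD-WoLFAntDA.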

It  is  interesting  to  note  that  test  cases  with  a  high  update  load  have  higher  standard 
deviations  for  cost  (Tables  11  and  12)  and  convergence  (Tables  13  and  14).  This  is  due  to 
                                     Convergence Rate (Iterations)
                                     AntDA                  WoLFAntDA              PD-WoLFAntDA
# DA  Quals/DA  RL Patt  UL Patt  #  min   avg      stdev   min   avg      stdev   min   avg      stdev
10    3         decr     low      1  4     210.88   110.76  1     1.22     1.56    1     1.05     0.21
10    3         decr     low      2  6     172.12   110.24  1     1.13     0.34    1     1.20     0.46
10    3         decr     low      3  20    130.58   91.59   1     1.45     0.50    1     1.61     0.58
10    3         decr     low      4  3     16.64    23.82   1     1.03     0.16    1     1.26     0.44
10    3         decr     low      5  3     52.08    68.75   1     1.07     0.25    1     1.33     0.61
10    3         incr     high     1  61    137.39   49.81   1     1.83     0.86    1     2.86     2.66
10    3         incr     high     2  19    221.00   86.70   1     6.92     4.71    1     8.54     6.09
10    3         incr     high     3  21    203.65   102.68  1     4.32     3.45    2     6.85     5.11
10    3         incr     high     4  11    159.48   68.74   1     1.83     1.56    1     2.57     2.35
10    3         incr     high     5  22    217.37   74.73   1     2.54     1.89    1     2.61     3.33
20    3         decr     low      1  10    195.84   112.06  1     1.46     0.52    1     1.25     0.45
20    3         decr     low      2  9     186.38   114.72  1     1.08     0.28    1     1.31     0.48
20    3         decr     low      3  5     126.46   120.26  1     1.14     0.36    1     1.00     0.00
20    3         decr     low      4  6     98.29    123.40  1     1.20     0.42    1     1.10     0.32
20    3         decr     low      5  5     137.70   118.67  1     1.40     0.52    1     1.30     0.48

Table 14: Comparison of convergence rates for WoLFAntDA and PD-WoLFAntDA to
AntDA for 10 and 20 DA problems. The lowest average convergence rate for each test case
is shown in bold typeface.


the fact that in the high UL pattern, update loads are boosted so that some servers are not
able to host high-quality replicas of some DAs. This makes the DA rep problem more
difficult to solve, which in turn causes a bigger disparity in solutions and convergence rates.

There were no big differences in the performance of WoLFAntDA and PD-WoLFAntDA.
Although WoLFAntDA had the lowest average convergence rate more often, PD-WoLFAntDA
was never far off. Additionally, each found approximately the same update burden for each
test case. This shows that both rules for determining whether an agent is winning or losing,
once tuned correctly, can be used to effectively solve the Quality-Sensitive DA Replication
Problem.
                                     Solution Cost (Update Burden)
# DA  Quals/DA  RL Patt  UL Patt  #  Random  Greedy  LINGO   AntDA  WoLFAntDA  PD-WoLFAntDA
5     1         n/a      low      1  305     258     246+    239    241        241
5     1         n/a      low      2  254     225     203+    203    203        214
5     1         n/a      low      3  298     243     231+    231    231        231
5     1         n/a      low      4  379     323     313+    305    304        306
5     1         n/a      low      5  351     307     292+    284    284        284
5     1         n/a      high     1  930     860     821+    796    800        796
5     1         n/a      high     2  698     659     656+    645    645        645
5     1         n/a      high     3  895     894     856+    856    874        856
5     1         n/a      high     4  998     983     964+    953    969        951
5     1         n/a      high     5  810     761     710+    708    717        708
5     2         decr     low      1  259     211     206*    196    200        196
5     2         decr     low      2  188     164     166*    160    160        160
5     2         decr     low      3  271     226     230*    217    219        217
5     2         decr     low      4  169     157     156*    155    155        155
5     2         decr     low      5  230     194     193*    187    187        187
5     2         incr     high     1  1070    890     829*    838    838        838
5     2         incr     high     2  990     809     831*    838    855        850
5     2         incr     high     3  999     819     781*    786    786        786
5     2         incr     high     4  1167    957     1002*   858    858        858
5     2         incr     high     5  974     809     832*    720    721        720
5     3         decr     low      1  242     206     237++   178    178        178
5     3         decr     low      2  220     186     215++   158    158        158
5     3         decr     low      3  186     155     166++   154    155        155
5     3         decr     low      4  177     151     158++   142    142        142
5     3         decr     low      5  196     171     176++   145    147        146
5     3         incr     high     1  1057    842     961**   784    797        793
5     3         incr     high     2  1135    884     940**   811    819        814
5     3         incr     high     3  1048    788     907**   764    764        764
5     3         incr     high     4  1099    849     885**   813    818        816
5     3         incr     high     5  1137    867     913**   811    814        814

Table 15: Comparison of AntDA, WoLFAntDA, and PD-WoLFAntDA to other search
algorithms for 5 DA problems. Depending on the problem, LINGO was allowed to run for
differing amounts of time: (+) = 1 hour, (*) = 2 hours, (++) = 4 hours, and (**) = 300 hours.
Only the best solution found by each method is shown. The best solutions for each problem
instance are shown in bold typeface.
                                     Solution Cost (Update Burden)
# DA  Quals/DA  RL Patt  UL Patt  #  Random  Greedy  AntDA  WoLFAntDA  PD-WoLFAntDA
10    3         decr     low      1  478     336     328    332        333
10    3         decr     low      2  532     360     340    349        350
10    3         decr     low      3  489     332     325    329        328
10    3         decr     low      4  492     324     318    321        321
10    3         decr     low      5  476     320     309    312        313
10    3         incr     high     1  2401    1703    1635   1659       1656
10    3         incr     high     2  2547    1817    1719   1758       1735
10    3         incr     high     3  2335    1676    1565   1604       1612
10    3         incr     high     4  2669    1899    1796   1832       1828
10    3         incr     high     5  2531    1791    1711   1747       1725
20    3         decr     low      1  1129    717     690    705        706
20    3         decr     low      2  1074    705     677    690        697
20    3         decr     low      3  1081    695     676    687        690
20    3         decr     low      4  1069    699     667    677        679
20    3         decr     low      5  1137    756     725    745        744

Table 16: Comparison of AntDA, WoLFAntDA, and PD-WoLFAntDA to other search
algorithms for 10 and 20 DA problems. Only the best solution found by each method is
shown. The best solutions for each problem instance are shown in bold typeface.
V.  Conclusion


This thesis effort examined several aspects of the Quality-Sensitive DA Replication
Problem (DA rep). In this problem, the ASP must assign DA replicas to its network of
heterogeneous servers so that user demand is satisfied at the desired quality level and replica
update loads are minimized. The thesis then proposed three simple algorithms for solving the
problem, and validated and analyzed the performance of the proposed algorithms compared to
other search algorithms.

5.1  Contributions  and  Achievements 

Major accomplishments and achievements of this thesis investigation include the
following.

1. DA rep was thoroughly discussed and the practical impediments to its solution were
identified (Chapter II).

2. Problems similar to DA rep were reviewed and reasons why solutions to those
problems are ill-suited for DA rep were discussed (Chapter II).

3. The ant colony optimization (ACO) meta-heuristic was discussed and has been shown
to be successful in solving difficult discrete optimization problems (Chapter II). An
ant colony algorithm, AntDA, originally proposed in [53], was further investigated and
results on its performance were reported. Highlights of the AntDA investigation include
the following (Chapter IV).

(a) Better values for the tunable parameters were determined and were shown to
lead AntDA to better solutions and faster convergence.

(b) Experiments limiting the number of ants depositing pheromone at the end of a
time step found that only allowing the ants with the top 10% of solutions to deposit
pheromone led to convergence over 4.5 times faster than allowing all
ants to deposit pheromone, while reducing update burden by 4%. This occurs
because the edges used in the best solutions are reinforced, which
allows the ants to search in the area of good solutions.
(c) Ants are allowed to invoke a Server Filling replica creation policy when creating
a replica on a server. This reduced update burden by more than 4.5% on average.
The policy assigns additional qualities of a (d, q) pair to a server with remaining
capacity, which keeps system-wide update burden low.

4. To help AntDA converge more quickly and find better solutions to more complex
problems, it was combined with the variable-step policy hill-climbing algorithm
called Win or Learn Fast (WoLF) to create two algorithms, WoLFAntDA and
PD-WoLFAntDA. Both algorithms are discussed (Chapter III) and results on their
performance are reported (Chapter IV). Highlights of the WoLFAntDA and PD-WoLFAntDA
investigation include the following.

(a) Better values for the tunable parameters are determined for the DA rep problem
and are shown to lead WoLFAntDA and PD-WoLFAntDA to converge very
rapidly with better solutions.

(b) Limiting the number of ants depositing pheromone at the end of a time step was
also tested for WoLFAntDA and PD-WoLFAntDA. Here, allowing only the ants
with the top 20% of solutions to deposit pheromone reduced the time to
convergence by a factor of over 2.5 compared to allowing all ants to deposit
pheromone.

(c) The number of ants allowed to update an edge’s policy values was also varied.
Results show that allowing only the ants with the top 10% of solutions to update
policy values reduced the time to convergence while keeping update burden
below that of the other solution methods.

(d) WoLFAntDA and PD-WoLFAntDA reduced the time to converge on the most
complex problem by over 99% compared to AntDA, while finding solution values
less than 3% higher on average.

(e) The addition of the learning algorithm to AntDA allowed the ACO heuristic
to be applied to more complex problems that could still be solved in a
reasonable amount of time. With WoLFAntDA or PD-WoLFAntDA, a 20 DA,
3 quality problem (the hardest tested) could, on average, reach a better solution
than the other search algorithms tested by the second iteration, bringing
its run time down to just two minutes instead of hours with AntDA.
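
The restricted pheromone deposit described in items 3(b) and 4(b) can be illustrated with a minimal Python sketch. It is not the update rule used in the thesis: the function name `elite_deposit`, the evaporation rate `rho`, the constant `deposit` amount, and the dictionary representation of the graph are all assumptions made for illustration. It only shows the general idea that every edge evaporates while only the best-ranked fraction of ants reinforces the edges it used (lower cost, i.e. lower update burden, is taken to be better).

```python
def elite_deposit(pheromone, solutions, top_fraction=0.10, rho=0.1, deposit=1.0):
    """Evaporate all edges, then let only the best-ranked ants deposit pheromone.

    pheromone: dict mapping edge -> pheromone level (modified in place)
    solutions: list of (cost, edges) pairs, one per ant; lower cost is better
    top_fraction, rho, deposit: illustrative parameters, not values from the thesis
    """
    # Evaporation applies to every edge, whether or not an ant used it.
    for edge in pheromone:
        pheromone[edge] *= (1.0 - rho)

    # Keep at least one ant so that some reinforcement always happens.
    n_elite = max(1, int(round(top_fraction * len(solutions))))
    elite = sorted(solutions, key=lambda s: s[0])[:n_elite]

    # Only the elite ants reinforce the edges of their solutions.
    for _cost, edges in elite:
        for edge in edges:
            pheromone[edge] = pheromone.get(edge, 0.0) + deposit
    return elite
```

With ten ants and `top_fraction=0.10`, a single ant deposits, so only the edges of the best solution gain pheromone during that time step while all others decay.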

5.2  Future  AntDA  Work 

Incorporating the learning algorithm into AntDA allowed WoLFAntDA to address
the issue of scalability, which is one of the most significant drawbacks of ant-based
algorithms. It made AntDA useful for realistically sized problems (20 DAs) by improving
convergence rates, but it failed to converge to the best answers found by AntDA. In this
regard, further research and testing need to be done. The main goal of any further work
should be to keep the convergence rate of WoLFAntDA and PD-WoLFAntDA while
improving solution values. The following list describes some ideas for future work.

1. Starting ants at an artificial start node (so that ants learn which (d,q) pair to assign
first), instead of positioning them at randomly chosen server vertices, may have an impact.
However, since the halting criteria are examined when moving from S to DQ vertices,
starting ants at a server vertex is probably wise.

2. Implement Equation 19 (Chapter III) so that the top-scoring solutions deposit
proportionally more pheromone on edges than low-scoring solutions.

3.  Implement  Equation  28  (Chapter  III)  so  that  the  top-scoring  solutions  can  add  or 
subtract  proportionally  more  policy  on  edges  than  low-scoring  solutions. 

4.  Develop  a  visualization  tool  so  that  the  graph  state  and  algorithm  behavior  can  be 
monitored.  Such  a  tool  would  allow  the  researcher  to  better  examine  the  impact  of 
parameter  values  and  algorithmic  problem  areas  that  could  be  further  addressed. 

5.  Adapt  AntDA,  WoLFAntDA,  and  PD-WoLFAntDA  for  use  in  dynamic  environments. 

6.  Allow  AntDA,  WoLFAntDA,  and  PD-WoLFAntDA  to  run  on  better  hardware  and  be 
applied  to  more  complex  problems. 
7. Compare AntDA, WoLFAntDA, and PD-WoLFAntDA with other search techniques,
such as breadth-first search and depth-first search, in order to confirm the worth of
these three algorithms in solving the Quality-Sensitive DA Replication Problem.

8. Adapt AntDA, WoLFAntDA, and PD-WoLFAntDA to run in parallel to allow for
faster results. These algorithms are essentially in this form already, since each ant
constructs its solution independently during a time step, on a map that is only
updated and redistributed between time steps.

9. Finally, future work should include the application of ACO and WoLF in combination
to other problems such as the Quadratic Assignment Problem (QAP), the Vehicle
Routing Problem, and many other problems to which ACO has already been applied.
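
The parallel structure noted in item 8 can be sketched as follows. This is only an illustration under stated assumptions, not the thesis implementation: `construct_solution` is a hypothetical stand-in for one ant's walk (it simply picks one option per item with probability proportional to pheromone), and `run_time_step`, the seeding scheme, and the nested-dictionary map are invented for the example. The point shown is that ants read a frozen snapshot of the map during a time step, so their solutions can be built concurrently and merged afterwards.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def construct_solution(snapshot, seed):
    """Hypothetical stand-in for one ant: for each item, choose an option with
    probability proportional to its pheromone in the read-only snapshot."""
    rng = random.Random(seed)  # private generator, so ants do not share state
    choice = {}
    for item, options in snapshot.items():
        names = list(options)
        weights = [options[name] for name in names]
        choice[item] = rng.choices(names, weights=weights, k=1)[0]
    return choice


def run_time_step(pheromone, n_ants, base_seed=0, workers=4):
    """Build all ant solutions for one time step against a frozen map."""
    # Freeze the map: ants only read this copy during the time step.
    snapshot = {item: dict(opts) for item, opts in pheromone.items()}
    seeds = [base_seed + i for i in range(n_ants)]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map returns results in input order, one per seed.
        return list(pool.map(lambda s: construct_solution(snapshot, s), seeds))
```

The pheromone update would then be applied once, after all ants have returned. In standard CPython builds, threads do not speed up CPU-bound work, so a process pool or a compiled language would likely be needed for real gains; the thread pool is used here only to keep the sketch short.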